I have one more question in mind about actually constructing the correlation structure. As I understand it, to put it simply, it relies on historical data to build some statistics. Then, to simplify further, the VaR is just a statistical value derived from historical data, isn't it?
If so, why not use historical simulation to calculate the VaR directly?
As you know, the drawback of MC is its computational cost.
Every method has its pros and cons. What is the practice in industry?
Good question. There are a number of issues relating to historical data, some of which Pat has touched on. One major one is the scarcity of data--or relevant data, anyway.
Typically, for historical VaR you generate historical transitions in the needed market data, treat these as samples from their "real" distributions, shock today's market data and reprice each time to get P&L's. Once you have P&L's, you look for the mth worst realization, with m chosen according to the number of samples (mainly a function of your lookback period) and your chosen "significance level," which is sort of a terrible misnomer but is nevertheless the standard term.
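To make that concrete, here's a minimal sketch in Python (numpy). The P&L's are simulated stand-ins for real repricing results, and the indexing of the mth worst outcome is one common convention among several:

```python
import numpy as np

def historical_var(pnl, level=0.95):
    """Historical-simulation VaR: the mth worst P&L, with m set by the
    sample size and the chosen level. Indexing conventions vary; this
    uses one common choice."""
    pnl = np.sort(pnl)                         # ascending: worst outcomes first
    m = int(np.floor((1 - level) * len(pnl)))  # e.g. 0.05 * 250 = 12
    return -pnl[m]                             # report VaR as a positive loss

# Toy usage: 250 one-day P&L's, as if from repricing today's book under
# 250 historical market-data shocks (simulated here for illustration).
rng = np.random.default_rng(0)
pnl = rng.normal(loc=0.0, scale=1.0, size=250)
print(f"95% historical VaR: {historical_var(pnl):.3f}")
```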
What you're looking at is the mth order statistic of a sample from some distribution, and like any statistic it is itself random. The farther out in the tail you go, the greater its variance. Not only that, but the particular shape of the distribution--which is generally not known--has a lot to do with how (potentially) bad this statistic is as a measure of the "actual" P&L in question, and it gets worse when distributions have fatter lower tails. 95% VaR estimated on a normal distribution with a typical year's worth of data--250 points or so--is pretty good as these things go: you can have decent confidence that your estimate lies within half a standard deviation (of the source distribution!) of the "actual" value.
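You can see this directly by resampling. A quick numpy experiment (normal source distribution; the seed and trial count are my own arbitrary choices) measuring the spread of the empirical tail quantile at two levels:

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 250, 20_000
samples = rng.normal(size=(trials, n))     # 20,000 hypothetical "years" of data

# The empirical tail quantile is itself a random variable; its spread
# across resamples grows as you move farther out into the tail.
for p in (0.05, 0.01):                     # 95% and 99% VaR levels
    q = np.quantile(samples, p, axis=1)
    print(f"{1 - p:.0%} VaR estimate: std across samples = {q.std():.3f}")
```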
Moreover, such point estimators are biased; again, the amount of the bias depends upon the source distribution. (Funnily, in the literature this estimate is most often described as "asymptotically unbiased," meaning that as your number of samples approaches infinity, the bias goes away. Omniscience is nice, no?)
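The bias is easy to exhibit the same way. A sketch contrasting a thin-tailed and a fat-tailed source--normal vs. Student-t(3), both my choice purely for illustration--at the 99% level:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, trials = 250, 20_000
p = 0.01                                   # 99% VaR level

for name, sample, true_q in (
    ("normal", rng.normal(size=(trials, n)),        stats.norm.ppf(p)),
    ("t(3)",   rng.standard_t(3, size=(trials, n)), stats.t.ppf(p, df=3)),
):
    # Average estimation error vs. the known true quantile = the bias;
    # it is nonzero at n = 250 and depends on the source distribution.
    q = np.quantile(sample, p, axis=1)
    print(f"{name}: bias = {q.mean() - true_q:+.3f}, std = {q.std():.3f}")
```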
By comparison, an estimate of a distribution's mean and variance based on a sample of size 250 is quite solid. Any error in those estimates, or in your choice of distribution, will of course be reflected in the MC VaR you end up calculating, but assuming your choice of distribution is correct or reasonably so (a big if), you can actually do better via MC than with the raw point estimate.
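For contrast, a sketch of that parametric/MC route under the same assumed-normal setup: fit the mean and variance from 250 points, then simulate from the fitted model. With enough draws the MC noise is negligible, so the remaining error comes only from the parameter estimates and the distribution choice itself:

```python
import numpy as np

rng = np.random.default_rng(3)
history = rng.normal(0.0, 1.0, size=250)   # stand-in for 250 historical P&L's

# Fit the assumed distribution's parameters from the sample...
mu, sigma = history.mean(), history.std(ddof=1)

# ...then estimate VaR by Monte Carlo under the fitted model.
sims = rng.normal(mu, sigma, size=1_000_000)
print(f"95% MC VaR under fitted normal: {-np.quantile(sims, 0.05):.3f}")
```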