Heteroscedastic regression models are used in fields including economics, engineering, and the biological and physical sciences. Often, the heteroscedasticity is modeled as a function of the covariates or of the regression and other structural parameters. Standard asymptotic theory implies that how one estimates the variance function, in particular the structural parameters, has no effect on the first-order properties of the regression parameter estimates; there is evidence, however, both in practice and in higher-order theory, that how one estimates the variance function does matter. Further, in some settings, estimation of the variance function is of independent interest or plays an important role in estimation of other quantities. In this article, we study variance function estimation in a unified way, focusing on common methods proposed in the statistical and other literature, both to make general observations and to compare different estimation schemes. We show that there are significant differences in both efficiency and robustness for many common methods. We develop a general theory for variance function estimation, focusing on estimation of the structural parameters and including most methods in common use in our development. The general qualitative conclusions are these. First, most variance function estimation procedures can be viewed as regressions with responses being transformations of absolute residuals from a preliminary fit or sample standard deviations from replicates at a design point. Our conclusion is that the former is typically more efficient, but not uniformly so. Second, for variance function estimates based on transformations of absolute residuals, we show that efficiency is a monotone function of the efficiency of the fit from which the residuals are formed, at least for symmetric errors. Our conclusion is that one should iterate so that residuals are based on generalized least squares.
Finally, robustness issues are of even more importance here than in estimation of a regression function for the mean. The loss of efficiency of the standard method away from the normal distribution is much more rapid than in the regression problem. As an example of the type of model and estimation methods we consider, for observation-covariate pairs $(Y_i, x_i)$, one may model the variance as proportional to a power of the mean response, for example, $E(Y_i) = f(x_i, \beta)$, $\mathrm{Var}(Y_i) = \sigma^2 f(x_i, \beta)^{2\theta}$, where $f(x_i, \beta)$ is the possibly nonlinear mean function and $\theta$ is the structural parameter of interest. Regression methods for estimation of $\theta$ and $\sigma$ based on residuals $r_i = Y_i - f(x_i, \hat{\beta})$ for some regression fit $\hat{\beta}$ involve minimizing a sum of squares where some function $T$ of the $|r_i|$ plays the role of the responses and an appropriate function of the variance plays the role of the regression function. For example, if $T(x) = x^2$, the responses would be $r_i^2$, and the form of the regression function would be suggested by the approximate fact $E(r_i^2) \approx \sigma^2 f(x_i, \hat{\beta})^{2\theta}$. One could weight the sum of squares appropriately by considering the approximate variance of $r_i^2$. For the case of replication at each $x_i$, some methods suggest replacing the $|r_i|$ in the function $T$ by sample standard deviations at each $x_i$. Other functions $T$, such as $T(x) = x$ or $T(x) = \log x$, have also been proposed. © 1976 Taylor & Francis Group, LLC.
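The residual-based regression idea described above can be illustrated with a minimal sketch. It is not the authors' procedure; it only assumes a power-of-the-mean variance model, Var(Y) = sigma^2 * f^(2*theta), a linear mean f = b0 + b1*x chosen for simplicity, and the transformation T = log. Under those assumptions log|r| is approximately linear in log f with slope theta, so theta can be estimated by ordinary least squares and the mean fit then repeated with weights f^(-2*theta), in the spirit of iterating toward generalized least squares. The function name `fit_power_variance`, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np

def fit_power_variance(x, y, n_iter=3):
    """Estimate beta and theta, assuming mean b0 + b1*x (positive)
    and Var(Y) = sigma^2 * mean^(2*theta). Illustrative sketch only."""
    X = np.column_stack([np.ones_like(x), x])
    w = np.ones_like(y)          # unit weights: the first pass is ordinary least squares
    beta, theta = None, 0.0
    for _ in range(n_iter):
        # Weighted least squares for the mean parameters.
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        mu = X @ beta            # fitted means, assumed strictly positive
        r = y - mu               # residuals from the current fit
        # Regress log|r| on log(mu); the slope estimates theta.
        # The intercept absorbs log(sigma) plus a constant and is not used here.
        Z = np.column_stack([np.ones_like(mu), np.log(mu)])
        coef, *_ = np.linalg.lstsq(Z, np.log(np.abs(r)), rcond=None)
        theta = coef[1]
        # Update weights to the inverse of the estimated variance function.
        w = mu ** (-2.0 * theta)
    return beta, theta

# Simulated data under assumed values beta = (2, 1.5), sigma = 0.1, theta = 1.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=2000)
mean = 2.0 + 1.5 * x
y = mean + 0.1 * mean ** 1.0 * rng.standard_normal(x.size)

beta_hat, theta_hat = fit_power_variance(x, y)
print(beta_hat, theta_hat)
```

The log transformation is used here only because it makes the variance regression linear. Using squared residuals, T(x) = x^2, would instead call for a nonlinear least squares fit, possibly weighted by the approximate variance of the squared residuals.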