This is the final project for Econometrics Class
Does the size of a site determine the quality of water management? I study determinants of overwatering for commercial and public sites in Northern California. I investigate how total landscaped area affects stakeholders’ engagement with a water management program and find that larger sites are much more likely to engage with it. Using panel data, I also analyze the effects of the water management program and of drought on water use. I estimate that the conservation program reduces the likelihood of overwatering by 10 to 18 percentage points and that drought effects amount to 1.3 to 3 percentage points. (JEL D83 O22 Q25)
- Introduction
California experienced the longest drought in its history from 2011 to 2019, reaching its peak in 2014-16. Managing the water used for irrigation is an important component in managing scarce water supplies. In 2017, California lifted the state of emergency as water conditions improved and water use rebounded. However, droughts will return, and efficient water management is still crucial to the state.
The purpose of my study is to investigate how certain factors impact the success of water conservation programs targeting commercial and public landscape irrigation. I use data provided by Waterfluence, a firm partnering with municipal water agencies in California to assist commercial and public sites with monitoring their water use. Waterfluence provides site stakeholders with reports showing how actual water use compares to a budget benchmark based on site-specific characteristics and real-time weather. The sites in this study are located in the Bay Area and account for a large portion of total landscape water use.
I find that the sites most likely to overwater are small commercial sites. Depth of overwatering is inversely correlated with site size. It is exactly these small commercial sites that are most affected by drought policies. I hypothesize this is because smaller sites carry the burden of abundance. Under normal circumstances, small sites overwater without much scrutiny because a small amount of overwatering has a low total cost to the stakeholder who pays the water bill, whereas the same inefficiency at a larger site would amount to a much higher cost. Larger sites therefore feel a sense of scarcity even in non-drought years because of the large magnitude of their water bills.
- Literature Review
Shah et al. (2015) found that scarcity frames decision making. In a series of studies, they found that value is shaped by scarcity, and that when someone feels scarcity, her valuation of a good is closer to the prediction of a standard economic model. In my study, I posit that small commercial sites overwater because they do not feel a sense of scarcity in non-drought years. Rather, they feel an abundance of water supply because their overages are less costly. Shah et al. argue that scarcity focuses the mind and improves someone’s ability to allocate resources efficiently. I contend that water use is one of the cases in which scarcity eliminates context effects and improves people’s decision making.
In other studies, researchers have found that inattention leads to suboptimal decision making (Caplin and Dean 2015; Matějka and McKay 2015). In an experiment with blueberries, Shah, Mullainathan, and Shafir (2012) found that scarcity demands attention and improved people’s ability to perform a challenging task. In the context of water budgeting, careful attention produces the best results. Attentive stakeholders and landscapers will look at Waterfluence’s reports and use that knowledge to optimize their water use in every period. I examine how scarcity affects the attention paid to, and the management of, irrigated landscapes.
Karlan et al. (2016) found that simple messages encouraging saving increased commitment to a savings program offered by banks. Similarly, all sites in Waterfluence’s program receive monthly reports reminding them of their water use and of ways to water efficiently. However, their sample consisted of people who had committed to a savings program. In my sample, by contrast, participants are not necessarily intrinsically motivated to conserve water because individual sites don’t elect to be in the program. Waterfluence partners with and is paid by the water agencies for the service, and sites receive individual information regardless of their motivation to irrigate efficiently. Karlan et al. also used a variety of messages and were therefore able to control for incentive-type effects. Waterfluence reports are similar for everyone who receives them and contain only site-specific water-use information and recommendations, so I can’t control for different forms of motivation.
- Data and Descriptive Statistics
The dataset includes 4,963 commercial and public landscape sites in the San Francisco Bay Area. Data include site characteristics, stakeholder interactions, and historical water use. Key summary statistics are in Table 1. The program’s main performance metric, which it aims to minimize, is the depth of overwatering: the volume of actual water use minus the budgeted volume, divided by irrigated area. This metric is area-adjusted and weather-normalized, enabling year-to-year and site-to-site comparisons. The views-per-year metric counts how many months of the year any site stakeholder viewed the online report. I have water use data from 2017 to 2019, plus some historical data from the year before each site joined Waterfluence, which may be any year since 2002.
Table 1: Summary Statistics

NOTE: Standard deviations are in parentheses, with medians below them. The non-Waterfluence statistics come from varying years, which may affect depth of overwatering, and the reported numbers are overestimates because any underwatering is recorded as 0.
There are 5,615 sites in total. I excluded 173 sites because of irregularities such as redevelopment. I also excluded sites from Milpitas because they are new to the program (2019) and have no viewership data, and 13 golf sites because they are large and professionally managed, unlike all other sites. I have 1 to 4 years of data for each site (3.2 years on average), and 98% of sites have historical water use data from before they started using Waterfluence.
There are a variety of site types, and the types differ in how landscaping is managed. Commercial sites, such as HOAs and offices, account for 76% of sites and 83% of water use and are often managed by independent landscape contractors. Public sites are primarily parks and schools and are often managed by in-house staff. Public sites are on average larger and have higher portions of turf. They also view Waterfluence online in 4 months of the year on average, and their mean depth of overwatering is -0.273 ft, meaning that public sites on average underwater rather than overwater. Small commercial sites, on the other hand, overwater by 2.31 ft on average and visit the website much less often than public sites.
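The depth-of-overwatering metric can be sketched in a few lines of Python. This is a minimal illustration, not Waterfluence's implementation: the function name, argument names, and units (cubic feet of water, square feet of irrigated area) are assumptions, the example site is invented, and the `floor_at_zero` flag mimics the convention that underwatering is recorded as 0 in the non-Waterfluence statistics.

```python
def depth_of_overwatering(actual_cuft, budget_cuft, irrigated_sqft, floor_at_zero=False):
    """Return overwatering depth in feet: (actual - budget) / irrigated area.

    Units are an assumption for this sketch: volumes in cubic feet and
    area in square feet, so the ratio is a depth in feet.
    """
    if irrigated_sqft <= 0:
        raise ValueError("irrigated area must be positive")
    depth = (actual_cuft - budget_cuft) / irrigated_sqft
    # Optionally record underwatering (negative depth) as 0.
    if floor_at_zero and depth < 0:
        return 0.0
    return depth


# Hypothetical site with 43,560 sq ft (one acre) of irrigated landscape.
over = depth_of_overwatering(150_000, 100_000, 43_560)
under = depth_of_overwatering(90_000, 100_000, 43_560, floor_at_zero=True)
print(round(over, 3), under)
```

Because the difference is divided by area, the same volume of excess water produces a larger depth at a small site than at a large one, which is why the metric allows site-to-site comparison.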
Waterfluence distributes monthly landscape reports to customers by mail or by online access to its website. The online content has more depth and allows multiple stakeholders, such as HOA board members, park staff, and landscape contractors, to view site information. In 2019, among sites that do not receive mailed paper reports, stakeholders at 66% of sites actively viewed program information via the Waterfluence website.
In 2019, overwatering at sites in the Waterfluence program averaged 1.343 feet across all irrigated landscapes. Of all sites, 60% overwatered to some degree, and 18% severely overwatered by more than 3 feet. Of the sites that overwatered, more than half were small commercial sites.
Graph 1: History of Overwatering in BAWSCA and Historical Drought
Notes: Overwatering data are from BAWSCA sites; drought data are from the US Drought Monitor. Drought is measured as the percent of California under D0-D4 drought conditions. I took a weighted sum across categories and averaged it over the year to get a measure of annual drought severity.
Historically, overwatering has decreased significantly since 2002 for sites using Waterfluence. Graph 1 illustrates this trend. This pattern is related to regulation of water use by state policymakers in response to droughts. Overwatering reached its lowest point in 2015 after Governor Jerry Brown declared a state of emergency. Since California lifted the state of emergency, overwatering has been rising.
- Econometric Model and Results
To measure how landscape size affects engagement, I use two different measures of size and two measures of engagement. The first dependent variable is the number of active contacts each site has listed. Each active contact for a site gets a monthly report through online access. The second dependent variable is viewership, defined as the number of months in a year in which any active contact looked at the online maps. The maximum is therefore 12, for a site viewed every month; very few sites reached it.
(2) Active Contacts i = 𝜷0 + 𝜷1(Size Group)i + 𝜷2 (Commercial Dummy)i + 𝜷3(Turf Portion)i + 𝜷4 (days in program)i + 𝜷5 (Mail dummy)i + (Water Agency)i + 𝑢i
The variable of interest is size. I first use a dummy variable that equals 1 if the site's area is more than an acre, which compares large sites to small sites. Sites are composed of turf area and shrub area, which require different amounts of water and management. I control for this plant-type makeup with the portion of the area that is turf. I also control for the number of days the site has been in the Waterfluence program and for whether it receives paper reports in the mail. Commercial is a dummy variable that equals 1 if the site is commercial and 0 if it is public. Water Agency is a set of dummies controlling for the 26 different water agencies in the sample.
(4) Active Contacts i = 𝜷0 + 𝜷1ln(Area)i + 𝜷2 (Commercial Dummy)i + 𝜷3(Turf Portion)i + 𝜷4 (days in program)i + 𝜷5 (Mail dummy)i + (Water Agency)i + 𝑢i
In Model (4), I measure size using the log of the total area in square feet instead of a dummy.
(5) Viewership i = 𝜷0 + 𝜷1(Size Group)i + 𝜷2 (Commercial Dummy)i + 𝜷3(Turf Portion)i + 𝜷4 (days in program)i + 𝜷5 (Mail dummy)i + 𝜷6-32 (Water Agency)i + 𝑢i
In Models (5)-(6), viewership is the dependent variable instead of active contacts. For these regressions, I drop sites that receive reports by mail because it is unknown whether they looked at the report. In Models (7)-(10), I rerun all the regressions using only data from commercial sites.
Table 2: Measuring the Effect of Size on Engagement

Notes: 2019 data. Standard errors are clustered by water agency. Controls are turf portion, days in program, mail, and water agency. Controls for (11) and (12) include detailed site-type controls such as HOA, office building, etc. For the viewership regressions, sites that receive information by mail and have no online access are omitted. Columns 9 and 10 include only commercial sites. Results for active contacts and viewership without controls are similar in magnitude and significance when only commercial sites are included. *, **, and *** denote significance at the 10%, 5%, and 1% levels.
Results from Table 2 indicate that larger sites are more likely to engage with Waterfluence. All coefficients on size are positive and most are significant at the 1% level, suggesting that a larger site is associated with more online viewership and more active contacts. A site larger than an acre is predicted to view the online site in 0.363 more months per year than a small site. This is not a large magnitude, but it is a meaningful improvement given that more than half the sites never look at the online site. Similarly, a site that is 10% larger is predicted to have 0.138 more active contacts. This result is significant at the 1% level, controlling for site type, days in the Waterfluence program, plant-type makeup, whether the site receives mail reports, and water agency. Standard errors are clustered by water agency to mitigate spatial autocorrelation.
Commercial sites are predicted to have 0.512 fewer active contacts than public sites and to view the online site about 2 fewer times a year. The median is 1 view per year for commercial sites and 4 views for public sites, a difference of more than one standard deviation. However, even among commercial sites alone, size has a positive and significant relationship with both viewership and active contacts. The results summarized in Table 2 indicate that small commercial sites are the least likely to engage with Waterfluence’s platforms. There are 2,378 small commercial sites among Waterfluence’s Bay Area sites, almost half of all sites in 2019. The inefficiency of this group has significant aggregate effects on water conservation for the area.
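Because Model (4) regresses a level (active contacts) on the log of area, the predicted change from a proportional increase in area is 𝜷1 × ln(1 + p) rather than 𝜷1 × p. The sketch below works through that arithmetic in both directions. It is only illustrative: the function names are hypothetical, and the sketch does not settle whether the 0.138 quoted above is the raw coefficient on ln(Area) or the already-converted effect of a 10% increase.

```python
import math


def level_log_effect(beta, pct_increase):
    """Predicted change in y when x grows by pct_increase (0.10 = 10%)
    in a model of the form y = b0 + beta * ln(x) + ..."""
    return beta * math.log(1.0 + pct_increase)


def implied_beta(effect, pct_increase):
    """Coefficient on ln(x) implied by a given effect of a pct_increase."""
    return effect / math.log(1.0 + pct_increase)


# Case 1: if 0.138 were the coefficient on ln(Area), a 10% larger site adds:
print(level_log_effect(0.138, 0.10))
# Case 2: if 0.138 were the effect of a 10% increase, the implied coefficient is:
print(implied_beta(0.138, 0.10))
```

The two readings differ by roughly a factor of ten, so it matters which quantity a table column reports.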
I use the number of water meters at a site as an instrumental variable for size. The number of meters is highly correlated with size because bigger sites need more meters. First-stage regressions of size on the count of meters, with size measured either as the dummy or as the log of area, are both significant at the 1% level (p-value below 0.0001), so the instrument is relevant. The count of meters is also plausibly unrelated to anything in the error term, such as landscaper quality or water price: the city installs the meters in order to bill sites, so the count does not depend on stakeholders or other site characteristics. The J-test statistics are insignificant for regressions using active contacts or viewership, with either the commercial-only or the full data set. With the instrumental variable, the estimated effect of size on viewership is 2.5 and is significant at the 1% level. Similarly, most coefficients increase by 0.5 to 4 times when the instrument is used. The difference in effect size indicates that there may be substantial endogeneity in the OLS estimates, and that size is a strong indicator of engagement.
Price of water, quality of equipment such as spray heads, technology such as weather-adjusted automated controllers, landscaper quality, and environmental attitudes of stakeholders are left in the error term of the simple OLS and 2SLS regressions. I can control for time-constant differences across sites using panel data that include the year before a site joined the Waterfluence program.
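With one instrument and one endogenous regressor, the two-stage least squares slope reduces to the ratio cov(z, y) / cov(z, x). The sketch below illustrates this on a tiny synthetic dataset and omits the control variables used in the regressions above. All numbers are made up: z stands in for the meter count, x for a size measure, and v for an unobserved factor correlated with size but not with the instrument. The data are constructed so that the true slope is 2 and OLS is biased toward zero.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def ols_slope(x, y):
    """Simple-regression OLS slope: cov(x, y) / var(x)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Single-instrument IV (Wald) estimator: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


# Synthetic data (assumption, not the paper's data).
z = [1, 3, 1, 3]            # instrument, e.g. count of meters
v = [-1, -1, 1, 1]          # unobserved factor, uncorrelated with z
x = [zi + vi for zi, vi in zip(z, v)]       # endogenous regressor
y = [2 * xi - vi for xi, vi in zip(x, v)]   # outcome, true slope = 2

print("OLS slope:", ols_slope(x, y))
print("IV slope:", iv_slope(z, x, y))
```

On these numbers the OLS slope is 1.5 and the IV slope is 2.0, so the instrument recovers the true slope because z is uncorrelated with v by construction. In real data that exclusion condition cannot be verified directly and has to be argued.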
(2) Over Watering i = 𝜷0 + 𝜷1(Waterfluence dummy)i + 𝜷2 (Drought Severity)i + 𝜷3(turf portion)i + 𝜷4 (days in program)i + 𝑢i
Waterfluence is a dummy variable that equals 1 if the site uses Waterfluence in that year. Drought severity is a statewide, year-level variable. It is the annual average of the weighted sum of the percent of California's area in drought, rounded to the nearest integer, with weights given by the severity categories D0-D4. Although drought severity is a statewide measure, it is a good measure of area-specific drought effects because policies on water use are made by the governor at the state level. Model (3) includes three commercial-size-type dummies that control for commercial versus public type as well as size group.
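The drought severity measure can be sketched as follows. The weights are an assumption: the text says only that the D0-D4 percent-area figures are weighted by severity, so this sketch uses hypothetical weights of 1 through 5, and it assumes each weekly record gives the percent of California's area in each category separately rather than cumulatively. The sample records are invented.

```python
# Hypothetical severity weights for Drought Monitor categories D0..D4.
WEIGHTS = {"D0": 1, "D1": 2, "D2": 3, "D3": 4, "D4": 5}


def weekly_score(pct_area):
    """Weighted sum of percent area across categories for one week."""
    return sum(WEIGHTS[cat] * pct_area.get(cat, 0.0) for cat in WEIGHTS)


def annual_drought_severity(weekly_records):
    """Average the weekly weighted sums over the year and round to the
    nearest integer (Python's round() sends exact .5 ties to the even integer)."""
    if not weekly_records:
        raise ValueError("need at least one weekly record")
    scores = [weekly_score(rec) for rec in weekly_records]
    return round(sum(scores) / len(scores))


# Two invented weekly records (percent of state area in each category).
sample_year = [
    {"D0": 20.0, "D1": 30.0, "D2": 10.0, "D3": 0.0, "D4": 0.0},
    {"D0": 10.0, "D1": 20.0, "D2": 20.0, "D3": 10.0, "D4": 4.0},
]
print(annual_drought_severity(sample_year))
```

Any monotone weighting would preserve the ordering of mild versus severe years; the choice of weights changes the scale of the drought coefficient, not its sign.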
(4) Over Watering i = 𝜷0 + 𝜷1(Years in Program)i + 𝜷2 (Drought Severity)i + 𝜷3(turf portion)i + 𝑢i
Model (4) includes only data from Estero Water Agency in Foster City because I have 13 years of data from 220 sites. The variable of interest is time in Waterfluence’s program in years.
(6) Over Watering i = 𝜷0 + 𝜷1(Waterfluence)i + 𝜷2-4 (Commercial-Size dummies)i + 𝜷5(Turf Portion)i + 𝜷6 (days in program)i + 𝜷7 (Water Agency)i + 𝑢i
Models (5)-(7) use cross-sectional data from 2017. I chose 2017 because 40% of the sites were not yet in Waterfluence that year. After controlling for site characteristics, the sites that joined Waterfluence in 2018 are a good control group for sites using Waterfluence in 2017. Model (7) includes an interaction term between the Waterfluence dummy and the small-commercial dummy.
Results from Table 3 show that sites in Waterfluence are about 18.0 to 18.4 percentage points less likely to overwater, holding constant all time-constant variables such as the area of the site, site type, landscaper quality, plant-type portion, and paper versus digital reports, as well as drought policies. This effect is significant at the 1% level. The effect is smaller for small commercial sites. Results from column (4) reveal that for every additional year in Waterfluence’s program, the likelihood of overwatering decreases by 2.1 percentage points, controlling for drought policies.
Table 3: Linear Probability Regression On Effect of Waterfluence on Overwatering

Notes: Standard errors are clustered by water agency and site. F-tests are asymptotically correct chi-squared values. In Models (4) and (5) the omitted dummy is Large Public. Site effects include days in the program, turf portion, and water agency. Year fixed effects include days in the program and turf portion.
Lastly, the results in columns (6)-(7) are the most striking. Regressions using only 2017 data indicate that Waterfluence completely eliminates the likelihood of overwatering, an effect significant at the 10% level. For small commercial sites, Waterfluence is associated with an additional 7.5 percentage point decrease in the likelihood of overwatering, which indicates that Waterfluence has more impact on these sites. I tested the sensitivity of the estimate in Model (6) under the alternative that the true effect is -0.836. I chose this alternative because the estimated difference between large public sites and small commercial sites is 22.4 percentage points, which places it 22.4 percentage points below the estimated effect of 106 percentage points in magnitude. The power of the test is 99%, so the model would fail to reject the false value of -1.06 when the true value is -0.836 only 1% of the time. The power is high because the sample size is so large.
If sites using Waterfluence would have behaved like sites not using Waterfluence in the absence of the service, I can use a difference-in-differences estimate of the effect. This is a plausible assumption because individual sites do not sign up for the program and the control group is composed of sites that will join Waterfluence the following year, so there is no selection bias as long as water agencies join Waterfluence for similar reasons from year to year. Using a difference-in-differences estimate, I find that the effect of Waterfluence on small commercial sites is a 9.5 percentage point decrease in the likelihood of overwatering. This is similar to the estimate from the interaction term. Overall, the results indicate that although Waterfluence has some effect on small commercial sites, they are still much more likely to overwater than other types of sites.
Omitted variable bias is a concern. One observable omitted variable is the price of water, which has only increased over the past twenty years. Although the price of water is correlated with drought severity, it still has a significant direct effect on water use that is left in the error term of my models. Another problem is that the panel data for Models (1) and (2) are unbalanced. I only have data for sites in the Waterfluence program from 2017 to 2019, and all earlier data are for sites before they joined Waterfluence. This selection problem means that the effect of drought severity is identified mostly from data on sites before they joined Waterfluence, because there is little variation in drought severity from 2017 to 2019 but a lot of variation in earlier years. Measurement error is also a concern. I could use a more specific indicator for drought and measure depth of overwatering rather than using a dummy variable. Additionally, panel data still might not capture landscaper quality because I do not know whether sites switch landscapers, and landscaping contracts usually change every three years. My preferred identification strategy would use data on individual landscapers and complete data for all sites over the past twenty years.
My results are consistent with Shah et al. (2015). Scarcity improved the performance of all sites, both through drought-related policies and, in non-drought years, through the cost of water. Size is a strong indicator of engagement with the program and of better performance in water management. Similar to the findings of Karlan et al., the Waterfluence intervention has a significant effect on improving water use, and even more so for small commercial sites. I would suggest further experimentation into the effect of goal-specific messaging for these small commercial sites, such as the messages used in Karlan et al.
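The difference-in-differences logic can be written out as a small calculation on group shares. The sketch below uses invented 0/1 overwatering indicators, not the paper's data, and the four group-period cells are an assumption about how treated and control sites might be arranged; the regressions reported here also include controls that this sketch leaves out.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: change in the treated group minus
    change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


def share_overwatering(flags):
    """Share of sites whose overwatering indicator equals 1."""
    return sum(flags) / len(flags)


# Invented 0/1 overwatering indicators for four group-period cells.
treated_pre = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]   # 8 of 10 overwater
treated_post = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]  # 6 of 10 overwater
control_pre = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]   # 7 of 10 overwater
control_post = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]  # 7 of 10 overwater

effect = did_estimate(
    share_overwatering(treated_pre),
    share_overwatering(treated_post),
    share_overwatering(control_pre),
    share_overwatering(control_post),
)
print(f"DiD estimate: {effect * 100:.1f} percentage points")
```

Subtracting the control group's change removes any shock common to both groups, such as a statewide drought policy, which is why the estimate depends on the two groups following parallel trends without the program.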
REFERENCES
Caplin, A., & Dean, M. (2015). Revealed preference, rational inattention, and costly information acquisition. American Economic Review, 105(7), 2183-2203.
Karlan, D., McConnell, M., Mullainathan, S., & Zinman, J. (2016). Getting to the top of mind: How reminders increase saving. Management Science, 62(12), 3393-3411.
Matějka, F., & McKay, A. (2015). Rational inattention to discrete choices: A new foundation for the multinomial logit model. American Economic Review, 105(1), 272-298.
Mullainathan, S., & Shafir, E. (2013). Scarcity: Why Having Too Little Means So Much.
Shah, A. K., Shafir, E., & Mullainathan, S. (2015). Scarcity frames value. Psychological Science, 26(4), 402-412. https://doi.org/10.1177/0956797614563958
Shah, A. K., Mullainathan, S., & Shafir, E. (2012). Some consequences of having too little. Science, 338(6107), 682-685.
US Drought Monitor. (2020). Drought in California from 2000-2020. National Integrated Drought Information System. Retrieved from https://www.drought.gov/drought/states/california
Whitcomb, J. (2019). 2019 BAWSCA Annual Report Large Landscape Program. Waterfluence LLC. Not available in the public domain.