Cutting Census Block Data to a Study Area

I'm trying to cut census block data to a study area. I have a study area polygon (made by selecting whole counties of interest) and clipped the census data with it. When I inspected the new census file, I found that this process had created "slivers" from the original census dataset: each is very small (less than a square mile in area) but carries a population in the thousands, because the entire population of the original block has been crammed into the tiny area. I had thought clipping the data by the study area would work, since blocks and block groups, at least to my knowledge, do not cross county lines. Is there an easy way to remove these extraneous slivers, or a better way to cut the census data to a study area?

The problem is probably that your county boundaries and your census blocks are not identical and do not line up exactly, which leaves you with slivers. Instead of clipping, you could query out the blocks belonging to the counties in your study area using the "COUNTYCU" field in the attribute table. This field gives a unique identifier to each county, so selecting on it ensures that you keep only whole blocks that fall within your study area.
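As a rough illustration of selecting by attribute instead of clipping, here is a minimal Python sketch. It assumes each block record carries the standard 15-character block GEOID (2-digit state, 3-digit county, 6-digit tract, 4-digit block) rather than the "COUNTYCU" field mentioned above; the record layout, the `POP` key, and the sample values are hypothetical.

```python
def county_of(block_geoid: str) -> str:
    """Return the 5-character state+county prefix of a 15-character block GEOID."""
    return block_geoid[:5]


def select_blocks(blocks, study_counties):
    """Keep whole blocks whose county prefix is in the study area (no geometry is cut)."""
    wanted = set(study_counties)
    return [b for b in blocks if county_of(b["GEOID"]) in wanted]


# Hypothetical sample records; real data would come from the block attribute table.
blocks = [
    {"GEOID": "370630001011000", "POP": 120},
    {"GEOID": "371830501002003", "POP": 85},
    {"GEOID": "370010201001002", "POP": 40},
]

study_area = select_blocks(blocks, ["37063", "37183"])
total_pop = sum(b["POP"] for b in study_area)
print(len(study_area), total_pop)
```

Because no polygon is split, every selected block keeps its original area and population, so slivers cannot appear.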

A Google Street View analysis of gentrification: a case study of one census tract in Northside, Cincinnati, USA

Since the 1990s, the gentrification process in the U.S. has diffused down the urban hierarchy and to neighborhoods further from downtown. This paper focuses on one census tract in the inner-ring suburbs of one mid-sized city in the U.S. undergoing gentrification between 2000 and 2016. While the use of census data to measure gentrification has been around for decades, a new tool, Google Street View, can now supplement the measurement of gentrification. Using census data, Google Street View imagery, and Hwang’s (Gentrification, race, and immigration in the changing American city, Harvard University, Cambridge, 2015) gentrification index, I document change in the built environment of the neighborhood of Northside (Tract 74) in the city of Cincinnati. This tract’s accessible location to downtown Cincinnati and the University of Cincinnati/Hospital Complex as well as the availability of high-quality, low-cost housing made it a focal point for individual, corporate, and government investment after 2000. By 2014/16, that investment had transformed the built environment of Tract 74. There is no evidence that the Great Recession slowed the gentrification process as Tract 74 transitioned from disinvested in 2011 to early stage gentrification by 2014/16.

Participant Statistical Areas Program

The Basics
The 2020 Census Participant Statistical Areas Program (PSAP) offers local governments, councils of government, and regional planning organizations the opportunity to review and modify select statistical boundaries that the U.S. Census Bureau uses to count people in our community.

PSAP in the Triangle
By participating in the PSAP, the Triangle region will ultimately be provided with the most relevant, useful data possible about population, income, and housing for small-area geographic analyses. The Census Bureau uses statistical boundaries to tabulate data for the 2020 Census, American Community Survey, Economic Census, and other surveys. Data tabulated to PSAP geographies are used by federal, state, and local agencies for planning and future purposes, as well as by the private sector, academia, and the public. Standard statistical geographies include: Census Tracts, Census Block Groups, and Census Designated Places (CDPs).

Outreach & Coordination with Our Partners
Triangle J COG worked with its seven counties and other interested local parties to ensure that the region’s priorities were appropriately considered in the delineation and verification of the statistical boundaries. Census data for updated statistical areas are used to prepare grant applications to fund community and regional development, education, agriculture, energy, and environmental programs, as well as other needed community improvements and enhancements. Census data are used to plan for future community needs, which necessitated outreach and engagement across multiple county departments, such as planning, public works, transportation, and GIS/information technology. In coordination with the counties, MPOs, and RPO, TJCOG submitted the final PSAP package to the Census Bureau on May 15, 2020. The Census Bureau will begin to release geographies from the 2020 Census in December 2020. The next opportunity to review and delineate statistical areas is planned for the 2030 Census.

Outcomes from PSAP Delineation & Verification
In coordination with local government and transportation planning agencies, TJCOG delineated and verified Census statistical boundaries for our seven counties. In addition, TJCOG and CAMPO staff coordinated closely with the Kerr-Tar Regional COG to make sure TAZs in Franklin, Granville, and Person counties figure into Kerr-Tar’s work. Ultimately, Triangle J COG submitted 119 tracts and 430 block groups as additions to the 2010 Census geographies, bringing the totals to 480 tracts and 1,299 block groups in the Triangle region. Statistical boundaries, such as tracts and block groups, break down large geographical areas into smaller, local areas. The increase in both tracts and block groups will provide TJCOG and its member communities with the most relevant, useful data possible about population, income, and housing in the Triangle region.

A full breakdown of TJCOG's PSAP Program process, timeline, and outcomes is available in the 2020 Census Participant Statistical Areas Program handout, drafted on May 29, 2020.

2020 Participant Statistical Areas Materials

TJCOG Activities

Quick Program & Reference Guides

Use TIGERweb to review the 2020 statistical areas

Refer to the PSAP Verification Quick Program and Respondent Guide for instructions on how to use TIGERweb to review 2020 PSAP statistical areas.

CensusCD 1980 [electronic resource]
New Brunswick, N.J. : GeoLytics, Inc., 2000
Version 2.0
CensusCD 1980 allows access to the complete results of the 1980 US census, down to the census tract. Over 2,500 demographic variables and geographic identifiers exist for every geographic area. A full set of 1980 maps, along with mapping software, is also included.

CensusCD 1980 [electronic resource]
New Brunswick, N.J. : GeoLytics, Inc., c1999
Version 1.0
"CensusCD 1980, is the first product to allow access to the complete results of the 1980 US census, down to the census tract. Over 1,500 demographics, and geographic identifiers exist for every geographic area. Over 50 demographics from the 1990 census, have been converted back to 1980 geographic areas. And a full set of 1980 maps, along with mapping software, have been included."--Introd.

CensusCD blocks [electronic resource] : complete U.S. block data and maps
[East Brunswick, New Jersey] : GeoLytics, 1998
Version 1.0
" . contains all the population and housing data from the Census Bureau's STF 1B and PL94-171 files, the latest TIGER boundaries, and over 50 geographic identifiers, including 1980 FIPS codes, and Zip Code to census block relationships"--User guide, leaf 2.

CensusCD neighborhood change database (NCDB) [electronic resource] : 1970-2000 tract data : selected variables for US Census tracts for 1970, 1980, 1990, 2000 and mapping too!
Long form release 1.0
E. Brunswick, NJ : GeoLytics, c2003
Also available on 1 CD-ROM : col. 4 3/4 in.
Contains nation-wide tract-level data from the 1970, 1980, 1990 and 2000 decennial censuses. Combines U.S. Bureau of the Census data into one product with variables and tract boundaries that are consistently defined across census years.

Utilizing geospatial analysis of U.S. Census data for studying the dynamics of urbanization and land consumption

Geographically referenced US census data provide a large amount of information about the extent of urbanization and land consumption. Population count, the number of housing units and their vacancy rates, and demographic and economic parameters such as racial composition and household income, and their change over time, can be examined at different levels of geographic resolution to observe patterns of urban flight, suburbanization, reurbanization, and sprawl. This paper will review the literature on prior application of census data in a geospatial setting. It will identify strengths and weaknesses and address methodological challenges of census-based approaches to the study of urbanization. To this end, a detailed overview of the geographic structure of U.S. Census data and its evolution is provided. Ecological Fallacies and the Modifiable Areal Unit Problem (MAUP) are discussed and the Population Weighted Density as a more robust alternative to crude population density is introduced. Of special interest will be literature comparing and/or integrating census data with alternative methodologies, e.g. based on Remote Sensing. The general purpose of this paper is to lay the groundwork for the optimal use of high resolution census data in studying urbanization in the United States.

Sprawl, Urban sprawl, City, Population Density, Population Weighted Density, Census, US Census, Census Geographies, Urbanization, Suburbanization, Urban flight, Reurbanization, Land Consumption, Land Use, Land Use Efficiency, LULC, Remote Sensing, Geospatial Analysis, GIS, Growth, Urban Growth, Spatial Distribution of Population, City Limits, Urban Extent, Built Environment, Urban Form, Areal Interpolation, Scale, Spatial Scale, Longitudinal Study, Dasymetric Mapping, Ecological Fallacy, MAUP, Modifiable Areal Unit Problem, Metrics
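To make the contrast between crude and population-weighted density concrete, the following is a minimal Python sketch. It uses the common definition of population-weighted density as the population-weighted mean of the densities of the sub-units (for example tracts or block groups); the abstract does not spell out a formula, so this definition and the sample figures are assumptions.

```python
def crude_density(units):
    """Total population divided by total area; units is a list of (population, area) pairs."""
    total_pop = sum(pop for pop, _ in units)
    total_area = sum(area for _, area in units)
    return total_pop / total_area


def population_weighted_density(units):
    """Mean of sub-unit densities, each weighted by the sub-unit's population."""
    total_pop = sum(pop for pop, _ in units)
    return sum(pop * (pop / area) for pop, area in units) / total_pop


# Hypothetical city: one compact, populous tract and one large, sparsely settled tract.
units = [(9000, 1.0), (1000, 9.0)]  # (population, area in square miles)

crude = crude_density(units)
weighted = population_weighted_density(units)
print(round(crude, 1), round(weighted, 1))
```

In this made-up case the crude figure is dragged down by the large empty tract, while the weighted figure stays close to the density that most residents actually experience.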

How census tract changes can be used for financial gain.

Here is an interesting story on how census geography can become very politicized.

There is concern about a boundary change to Pittsburgh's downtown census tract (305) during the 2020 census. Part of the tract was allocated to the neighboring Hill District to try to secure a federal Opportunity Zone designation and bring investment into the neighborhood. This designation would allow private businesses to take advantage of tax breaks by investing in the area.

So, what’s the problem? Some people in the community are wary of any new development in the district that may bring in more affluent residents. This area is the historical center of the African American community in Pittsburgh. The most vocal group is the Hill Community Development Corporation. They are worried that a change in the demographics may be detrimental to the neighborhood, because any increase in the income level would make the Hill ineligible for affordable housing and other community development efforts (such as HUD’s Community Development Block Grant Program).

According to PublicSource, there were communications between the Pittsburgh Penguins organization and the State of Pennsylvania’s Department of Community and Economic Development (DCED) to try to make the area eligible for an Empowerment Zone. Also, the Pittsburgh mayor’s office contacted the Internal Revenue Service regarding a census boundary change to include parts of the Hill area in the tract (the section that included the hockey stadium).

As it stands, the Hill CDC will hold a meeting at the request of the Pittsburgh Penguins organization to discuss the matter. That meeting is set for March 15, 2021.

Let’s back up a bit and start with what a census tract is and how the review process works.

According to the US Census Bureau, census tracts are “relatively permanent small-area geographic divisions of a county or statistically equivalent entity defined for the tabulation and presentation of data from the decennial census and selected other statistical programs.” Every ten years, as part of the decennial census, the Census Bureau’s Geography Division conducts the Participant Statistical Areas Program (PSAP). According to the November 2018 final Federal Register notice, the program gives “designated governments or organizations an opportunity to review and, if necessary, suggest updates to the boundaries and attributes of the census tracts in their geographic area through the Participant Statistical Areas Program (PSAP). The program also encompasses the review and update of census block groups, census designated places, and census county divisions.”

So if they were designated, the city of Pittsburgh and Allegheny County were able to create and review any changes to census tracts, census block groups, census designated places (CDPs), and census county divisions before they were published. The reviewer(s) would have looked over the changes and contacted the Census Bureau if they had any concerns or changes to the geographies. Any changes to Tract 305 went through a local and federal process. The tone of the article seems to want to place blame on the Census Bureau for the boundary change. (If there had been no local review, any change proposed by the Census Bureau would have gone through.) The Census Bureau has yet to comment on the matter.

After reading the story I have some questions.

  • Was there a local review of the tract?
  • Who were the reviewers?
  • If not, why didn’t they publicize it to the public?
  • Why was interference by a private organization entertained?

Let’s hope that we get some answers.



Estimating Household Transportation Behavior and Housing Costs

Once these models have been developed, we can use them to estimate average autos per household and vehicle miles traveled and the percent of commuters using transit for all 198,373 Census block groups covered by the Index. This is accomplished by plugging data for each of the 15 predictor variables for each block group into both the SEM and the VMT model.

Figure 3: General Method for Estimating Transportation Usage from Regression Models

Most of the input variables come from data that describe features of a neighborhood that are common to everyone who lives there: population density, walkability, transit access and quality, and employment access (these are all features of the built environment). For inputs that identify characteristics of the residents themselves (household size, income, and number of commuters), using actual data for each block group wouldn't produce a very useful Index. Since people tend to live in places they can afford, using actual demographic data would produce a map where the majority of neighborhoods look more or less affordable. Instead, we have chosen eight household profiles, characterized by the number of family members, income, and number of commuters, that represent a wide range of American families, providing useful insight on affordability for a variety of different users, including consumers, planning agencies, real estate professionals, and housing counselors.

Table 3: Household Profiles Used for Estimating Transportation Usage
Household Profile             Income                   Size   # of Commuters
Median-income family          MHHI                     4      2
Very low-income individual    National poverty line    1      1
Working individual            50% of MHHI              1      1
Single professional           135% of MHHI             1      1
Retired couple                80% of MHHI              2      0
Single-parent family          50% of MHHI              3      1
Moderate-income family        80% of MHHI              3      1
Dual-professional family      150% of MHHI             4      2

MHHI = Median household income for a given CBSA

Each CBSA and rural county has a unique set of household profiles. We use these regional profiles in combination with the model's block-group-level input variables to estimate household housing and transportation costs. Figures 4 and 5 illustrate how this is done for homeowners and renters, respectively, using the SEM, which estimates the number of autos per household, the percentage of commuters using transit, and housing costs for each Census block group and non-metropolitan county. This results in 16 sets of estimates for these 3 variables: one set for homeowners and one for renters for each of the eight household profiles.
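The way Table 3 turns one regional median household income into eight profiles, and then into 16 owner and renter combinations, can be sketched in Python as follows. This is only an illustration of the bookkeeping, not the Index's actual code: the function name, the dictionary layout, and the sample MHHI and poverty-line values are hypothetical, and the SEM and VMT models themselves are not reproduced.

```python
# (profile name, income as a percent of MHHI or None for the poverty line, size, commuters)
PROFILES = [
    ("Median-income family", 100, 4, 2),
    ("Very low-income individual", None, 1, 1),
    ("Working individual", 50, 1, 1),
    ("Single professional", 135, 1, 1),
    ("Retired couple", 80, 2, 0),
    ("Single-parent family", 50, 3, 1),
    ("Moderate-income family", 80, 3, 1),
    ("Dual-professional family", 150, 4, 2),
]


def regional_profiles(mhhi, poverty_line):
    """Build the 16 profile/tenure combinations for one CBSA or rural county."""
    combos = []
    for name, pct, size, commuters in PROFILES:
        income = poverty_line if pct is None else mhhi * pct / 100
        for tenure in ("owner", "renter"):
            combos.append({"profile": name, "tenure": tenure, "income": income,
                           "size": size, "commuters": commuters})
    return combos


# Hypothetical inputs for a single region.
combos = regional_profiles(mhhi=60000, poverty_line=12000)
print(len(combos))
```

Each of the resulting dictionaries would then be combined with a block group's built-environment variables and passed to the models.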

Figure 4: Estimating Autos per Household, the Percentage of Commuters Using Transit, and Housing Costs for Eight Different Homeowner Household Profiles
Figure 5: Estimating Autos per Household, the Percentage of Commuters Using Transit, and Housing Costs for Eight Different Renter Household Profiles

Figure 6 illustrates how this works using the VMT Model, generating VMT estimates for the eight profiles.

Figure 6: Method for Estimating VMT for Eight Different Renter Household Profiles

Integrated geographic information systems (IGIS) analysis and definition of the tectonic framework of northern Mexico

Crustal rupture structures reactivated in the course of the tectonic history of northern Mexico are the surface expressions of planes of weakness. They take the form of simple or composite rectilinear or slightly curved features, defined as lineaments. Unless otherwise defined as strike-slip faults, lineaments are part of parallel and sub-parallel oblique convergent or oblique divergent tectonic zones that cross-cut the Sierra Madre Occidental and northern Mexico with a NW trend. These shear zones are the response to the oblique subduction of the Farallon plate beneath North America. Kinematic analysis of five selected sites in northern Mexico (three basins and two compressional shear zones) showed that shear mechanism diagrams and models from analogue materials can be combined with satellite imagery and geographic information systems as an aid to defining strike-slip fault motion. This was done using a reverse engineering process of comparing geometries. One of the sites assessed, involving the Parras Basin, the Coahuila Block (CB), the San Marcos fault, and a postulated PBF-1 fault, allowed for a palinspastic reconstruction of the CB that corroborated the vector motion defined, in addition to an extension of ∼25% in a northwest-southeast direction. A GIS-based compilation of georeferenced regional structural studies by several researchers was used as ground control areas (GCA); their interpolation and interpretation resulted in a tectonic framework map of northern Mexico. In addition, shaded relief models overlaid by the lineament/fault layer allowed structural analyses of basins related to these major structures.
Two important results were obtained from this study. First, the Tepehuanes-San Luis fault (TSL) and the Guadalupe fault, named herein, displace the Villa de Reyes graben and the Aguascalientes graben, respectively, to the SE, confirming their left-lateral vector motion; afterwards, the TSL was displaced south by the right-lateral strike-slip Taxco-San Miguel de Allende fault. The second result refers to the hypothesis that the Mesa Central was brought to its present location by a subduction zone located to the north. This subduction zone is consistent with the one postulated by several researchers. The compressional zones refer to segments of the Sinforosa and a postulated Aquinquari fault located in the stratotectonic Guerrero Terrane, regarded as a highly mineralized zone. Negative anomalies near -200 milligals are strongly suggestive of a cratonic block in western Chihuahua, named here the Western Chihuahua Cratonic Block (WCCB). In the southwestern portion of the North American craton the age provinces are well documented, but the block-versus-mobile-belt idea has not been put forth or emphasized. The present study combines several types of data (sedimentological, structural, igneous geochemical, and geochronologic) to evaluate this behavior in SW NA, and the proposed block is tested against these data. The presence of the WCCB is supported by a wide variety of data. Basins, troughs, aulacogens, bimodal volcanism, and other rift and rift-shoulder features characterize the spatially constrained mobile belts. Mobile belts surrounding the WCCB contain geologic records of events going back to 1.4 Ga, with different aspects being dominant over geologic time. Mobile belts participate in compression (subduction), extension (rifting), and transform (lateral) faulting. The WCCB may have been derived from the closely adjacent North American craton by mobile belt action.
This study has shown that integration of data is essential because it allows detection of differences among hypotheses for the same event in the same area. This integration capability is what makes integrated geographic information systems a powerful tool, not only for their synergy but also because they can be combined with specific techniques that provide data before fieldwork is conducted. Whether the tectonic framework of northern Mexico can be defined depends on the viability of integrating volumes of data from research, hypotheses, and maps and putting them together under the same geographic frame.


In the last 15 years, public health researchers have documented disparities in health status associated with the structural and social characteristics of neighborhoods that cannot be explained by individual differences in risk profiles. A broad range of health outcomes has been considered in neighborhood research including indices of adult physical health [1], [2], [3], [4], [5], [6], [7], [8], [9], adult mental health [10], [11], [12], [13], [14], [15], [16], and child health [6], [7], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28].

Health outcome data at smaller geographic resolution (for example spatially referenced individual level data) are becoming increasingly available, furthering the study of neighborhood effects on health. Unfortunately, secondary data sources, such as the census, may be inadequate for these studies because aggregating over census geographies (e.g. blockgroups, tracts, etc) loses much of the variation that is valuable for analysis of individual level data. To address the limitations of secondary data sources, a number of public health researchers have employed community audits to provide not only current data on neighborhoods but also direct observation of neighborhood conditions.

Two recent reviews of the use of neighborhood observation methods in public health research [29], [30] document the wide range of approaches used in the implementation of community audit methods, and the limitations of the extant literature on these methods. Spatial resolution is one of the key factors along which built environment data differ. There are many reasons for this. First, if secondary data are used, the researcher must adjust and use the best spatial resolution available. In the absence of other available data, some information about the built environment is certainly better than none, and some policy questions can be broadly assessed without high-resolution data. Second, data at higher spatial resolution are more expensive to obtain because a larger number of observations per area is needed. Costs and benefits of collecting and analyzing high-resolution versus lower-resolution observational data should be considered. Additionally, the scale of the research question sometimes determines the resolution of data used. It is impractical for researchers who are studying the impact of a nationwide or statewide program to attempt to use the method presented here to collect high-resolution observational data. However, research studies focusing on smaller geographies (within a neighborhood or a city sector) might find high-resolution data more beneficial. In short, the research question should determine the spatial resolution needed.

With this in mind, there is a need to examine the reliability and predictive utility of observational data collected at varying levels of aggregation. Examining the robustness of the relationship between individual outcome measures and observational data at different geographic aggregation levels is essential for investigating the presence of potential omitted variable biases that result when geographic boundaries are correlated with unobserved individual characteristics. Additionally, policy initiatives that operate at varying geographic scope would be better informed if the robustness of results to lower (and higher) level aggregation schemes were evaluated. However, due to the level of spatial aggregation at which the observational data are often collected, most of the existent data sets are not well suited for a thorough investigation of this robustness. In this paper, we present a new interdisciplinary community audit methodology for collecting data on neighborhood factors at the smallest geographic unit possible and assess the advantages of using data at such a high spatial resolution for micro-level studies in neighborhood-related health outcomes.

Previous work in this arena, such as the extensive data collected in conjunction with the Project on Human Development in Chicago Neighborhoods (PHDCN) and many others, has focused on the face block (the parcels facing the same street segment or “block”) as the smallest geographic level for collecting observational data, but our methodology increases the spatial resolution by analyzing the parcel. In residential neighborhoods, a parcel consists of the house and surrounding yard, or all of the property that a homeowner owns and that is assessed for tax purposes. Most neighborhood audit tools measure some elements of both the physical and social conditions in the neighborhood. The parcel observation methodology we present focuses exclusively on the physical dimension. Raudenbush and Sampson [31] analyzed PHDCN data and found that measures of physical disorder are relevant gauges of neighborhood condition at higher spatial resolution than measures of social disorder. Further, they note that reliability among face block measures of physical disorder may be improved by increasing the number of items observed for each face block. Collecting parcel observations achieves this goal. Beyond the ecometrics-based rationale put forth by Raudenbush and Sampson [31], there are also numerous practical reasons for expanding data collection to include parcel observations: (1) it allows for maximum flexibility with regard to spatial aggregation; (2) it allows the researcher to distinguish between observations over which the household has direct control (i.e. the upkeep of their yard/property) and observations which are impacted by others in the community (i.e. the upkeep of common areas such as parks or the upkeep of other properties in the neighborhood); and (3) it allows for the data and research outcomes to be related to property values, which have direct policy impact through the tax base.
However, while the advantages appear strong, no systematic work has been done to determine whether the advantages outweigh the costs of collecting data at this micro level.
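The flexibility in spatial aggregation that parcel-level data offer can be sketched in a few lines of Python. The example below is purely illustrative: the field names, the disorder score, and the sample parcels are hypothetical and do not come from the audit instrument described here.

```python
from collections import defaultdict
from statistics import mean


def aggregate(parcels, level):
    """Average the parcel-level score within each unit of the chosen geography."""
    groups = defaultdict(list)
    for parcel in parcels:
        groups[parcel[level]].append(parcel["score"])
    return {unit: mean(scores) for unit, scores in groups.items()}


# Hypothetical parcel observations, each tagged with the larger units that contain it.
parcels = [
    {"parcel": "p1", "face_block": "fb1", "block_group": "bg1", "score": 1},
    {"parcel": "p2", "face_block": "fb1", "block_group": "bg1", "score": 3},
    {"parcel": "p3", "face_block": "fb2", "block_group": "bg1", "score": 5},
    {"parcel": "p4", "face_block": "fb3", "block_group": "bg2", "score": 2},
]

by_face_block = aggregate(parcels, "face_block")
by_block_group = aggregate(parcels, "block_group")
print(by_face_block, by_block_group)
```

The same parcel records feed both summaries, so a relationship can be re-estimated at each level without collecting new data; data gathered only at the face block level could not be disaggregated in this way.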

An issue related to geographic aggregation is the problem of how to operationally define neighborhood boundaries—or put more generally, the modifiable areal unit problem (MAUP) [32]. When addressing the question about how to measure neighborhood characteristics, Guo and Bhat [33] state that “we should measure what matters to people over the area that really matters to people” (p. 31). This suggests neighborhood boundaries should be selected thoughtfully and may vary depending upon the research question at hand. To date, a few investigators studying the effects of neighborhood context on health have utilized sophisticated spatial definitions of neighborhoods [34], [35], [36], [37], [38], but no studies have been able to comprehensively compare the utility of community audit data collected at varying levels of aggregation. This is a significant gap in the extant literature. Boone et al [39] find that associations between physical activity and street connectivity vary by setting and geographic scale. The same is likely true of associations between health outcomes and neighborhood observational data. Most often, however, public health researchers rely upon administrative boundaries of neighborhoods such as census tracts or census block groups, and only a single geographic scale is analyzed.

The observational instrument developed is intended to be useful for analyzing the relationship between place and individual health and well-being while avoiding bias created by the MAUP. Can data at a high spatial resolution improve studies and, in particular, public health policy implications regarding the relationship between neighborhood conditions and health? While we acknowledge that data at lower spatial resolution are helpful and sufficient in some cases, we believe that higher resolution does present advantages that can improve and fine-tune policy implications. First, data at high spatial resolution allow the researcher nearly complete flexibility in specifying neighborhood boundaries, and hence a thorough investigation of MAUP bias is possible (though we note that a thorough analysis of this issue is outside the scope of this paper). For example, studies may find a statistically significant correlation between average census tract condition and obesity. However, unless one investigates this relationship further at varying levels of geographic resolution, it is unknown whether the results are biased by other omitted variables correlated with the geographic census tract definition. Second, data at high spatial resolution allow public health policy makers to identify the most appropriate geographic level for public health interventions. Continuing with the previous example, a relationship between census tract condition and health is an important observation, but policy relevance may be greatly improved if additional insight were available on the geographic scale at which scarce public resources should be deployed to improve public health. Policy makers need to know the comparative implications of enacting, for example, broad-based local neighborhood clean-up initiatives throughout the census tract versus concentrated initiatives to improve only the most blighted areas within the census tract.
Neighborhood observational data will allow the researcher to specify exactly which geographic definitions matter most for a particular policy implication.

The purpose of this report is three-fold. First, we describe the methodology and how the method was implemented to ensure the collection of high quality observational data. Next, we analyze the reliability of the data and the relationships found between the variables observed. Finally, we examine the utility of these data to examine neighborhood conditions at different levels of aggregation and how such data might be used in studies of neighborhoods and health.

Modernizing the U.S. Census (1995)

APPENDIX F: Business Uses of Census Data

Census data are used by many in the private sector, by both for-profit and not-for-profit organizations. Retail establishments and restaurants, banks and other financial institutions, media and advertising, insurance companies, utility companies, health care providers, and many other segments of the business world use census data. In the past, household-level data on consumers at the zip-code and census-tract levels have been classified by characteristics such as age, sex, and income. Increasingly, however, individual households are contacted by direct mail or some other type of direct media (e.g., newspaper inserts). This appendix contains some examples of uses of census data by various segments of the business community. Evidence of the great use of census data by businesses is provided by the Division of Research and Statistical Services of the South Carolina Budget and Control Board (the state data center for census information in South Carolina), which estimated that 35 percent of the annual requests received for census data are from businesses.

Small-area data are important for many business applications. Some businesses use small-area data as a substitute for household-level data. More important, however, is the ability to aggregate the small-area census data to nonstandard geographic areas, for example, business trade areas. As long as these data are available, businesses can create aggregations of data into areas for which data have never been published. The smaller the level of geography for which data are available, the more creatively businesses can create aggregations and the more precisely they can define the geographic area.
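The aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tract identifiers, the population counts, and the trade-area names are made up, and the function `aggregate_to_trade_areas` is a hypothetical helper, not part of any census product.

```python
from collections import defaultdict

# Hypothetical tract-level counts; identifiers and numbers are illustrative only.
tract_population = {
    "tract_A": 4200,
    "tract_B": 3100,
    "tract_C": 5600,
    "tract_D": 2800,
}

# Hypothetical assignment of tracts to business-defined trade areas.
trade_area_of = {
    "tract_A": "north_store",
    "tract_B": "north_store",
    "tract_C": "south_store",
    "tract_D": "south_store",
}


def aggregate_to_trade_areas(counts, assignment):
    """Sum small-area counts into the custom areas named in `assignment`."""
    totals = defaultdict(int)
    for unit, count in counts.items():
        area = assignment.get(unit)
        if area is not None:  # units outside every trade area are skipped
            totals[area] += count
    return dict(totals)


trade_area_totals = aggregate_to_trade_areas(tract_population, trade_area_of)
print(trade_area_totals)
```

The smaller the units in `tract_population`, the more freely the assignment table can be redrawn, which is the point the paragraph makes about small-area geography.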

Several key types of small-area census data (at tract and other geographic levels) are used for business purposes: age; education; employment; housing unit age, tenure, heating fuel, type, value, and rent; income; occupation; persons in household; phone availability; race and ethnicity; and commute to work and vehicles per household.

Products such as maps showing the concentration of a specific racial or ethnic group by specific areas (e.g., county, census tract, or zip code), or maps showing moderate-, high-, or low-income areas, can be produced using census data. Data also can be used to create consumer profiles, which can help in targeting advertising to current and potential customers; finding new customers; and analyzing locations, selecting sites, and competing against other businesses in a market area. Both the maps and consumer profiles (which may also be linked to a map) are used by businesses to target their markets more effectively. As the use of geographic information systems has grown, the demand for small-area geographic data has also grown. And, in turn, the new-found congruence of accessible geographically referenced small-area data is promoting the use of small-area census data for business purposes even further.

For example, a retail corporation with plans to expand could analyze potential markets before selecting sites. A specific case study (Kintner et al., 1994) involved examining and assessing various markets for a corporation's planned expansion. Several potential markets were selected by the corporation for the expansion, and the corporation wanted to determine which of the potential markets would be the most successful. Although the company's staff would make the final decision about the exact location of the sites, consultants were hired to analyze the potential revenue for each market. First, the consultants developed a model for analyzing the potential markets. The model took into account a number of variables (such as population, number of firms employing 100 or more workers, number of vehicles entering the county, and size of the transient population) that could help predict the viability of a site in areas selected for analysis. Some data were from business sources, but census data provided an essential component for analysis. Information on existing markets was used in the model to help determine its accuracy. Then, the predicted revenues for each of the existing locations generated by the model were compared with the actual revenues of those markets, enabling the corporation to assess and identify the strengths and weaknesses of the model. Next, data were collected for the potential new markets. By adding the new data to the model, revenue estimates were created for the potential markets, and the markets were ranked based on their predicted revenue. Markets that were the most promising were selected for additional analyses and reviewed by the corporation's staff, who then were able to select the best markets for the corporation's expansion.
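The workflow in this case study (fit on existing markets, compare predicted with actual revenue, then rank the candidates) can be sketched as follows. The sketch is deliberately simplified and is not the consultants' model: it assumes a single predictor (population) and an ordinary least-squares line, whereas the study used several variables, and every market name and figure below is invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical existing markets: (population in thousands, actual revenue in $ millions).
existing = {
    "market_1": (100, 2.0),
    "market_2": (200, 3.0),
    "market_3": (300, 4.5),
}

# Hypothetical candidate markets: population in thousands.
potential = {"market_X": 150, "market_Y": 400, "market_Z": 250}

intercept, slope = fit_line(
    [pop for pop, _ in existing.values()],
    [rev for _, rev in existing.values()],
)


def predict(population):
    """Predicted revenue for a market of the given population."""
    return intercept + slope * population


# Step 1: check the model against markets whose actual revenue is known.
errors = {name: rev - predict(pop) for name, (pop, rev) in existing.items()}

# Step 2: rank candidate markets by predicted revenue, highest first.
ranking = sorted(potential, key=lambda name: predict(potential[name]), reverse=True)
print(errors)
print(ranking)
```

The `errors` dictionary plays the role of the comparison between predicted and actual revenues; large residuals for particular markets would point to weaknesses in the model before it is applied to new areas.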

It is clear to the panel that businesses use small-area data creatively and effectively for a number of applications, and that small-area census data are important to those applications. However, it is difficult to foresee the effects of a loss of small-area census data. There could be a negative impact on efficiency and competitiveness, impacts that would be difficult to measure.

This appendix describes the business uses of census data for a variety of industries, including retail and restaurant, banks and other financial institutions, media and advertising, insurance, utilities, health care, nonprofit, and others. The review is not exhaustive of all industries, nor comprehensive in the many ways that census data are used. Rather, the purpose is to highlight several common uses, for a variety of industries, to illustrate the specific ways census data are used to reach business decisions and to improve business marketing. The examples cited below are taken from Thomas and Kirchner (1991), a recent publication on desktop marketing that describes ways that demographic data are used by businesses.


Retail and service businesses, such as restaurants, use data to decide where to locate their stores and how to effectively market their goods and services. A retail chain might use population, poverty, income, and labor-force data for a state and for a city or county to study the possibility of a retail outlet. For example, county-level population figures for women aged 16-34 years could be used to help determine the location for a maternity shop. Or a children's clothing retailer could use age data, income data, and retail statistics to select a location.

A fast-food restaurant chain was able to better target employee recruiting efforts and improve service by analyzing concentrations of the population with desirable employee traits/lifestyle characteristics (including longevity of employment). To accomplish this task, the restaurant chain identified the characteristics of its current base of employees and located areas with high concentrations of potential employees, that is, a population whose characteristics were the same as the most successful current employees. In creating the profile of current employees, past and present employees with at least 6 months of service with the restaurant were categorized into 1 of 50 categories based on census block-group characteristics of their neighborhoods. The categories were charted according to the percentage of total current employees falling into the category. Using the data, the restaurant was able to identify categories of workers that were likely to become restaurant employees and determine areas where they lived. Recruiting efforts were targeted to those areas using mail and newspaper advertisements, among other techniques. The restaurant has found this ability to be useful in existing markets and new markets, and it has helped reduce turnover in the restaurants, resulting in improved customer service (Thomas and Kirchner, 1991:55-60).
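The profiling step can be illustrated with a small Python sketch. It assumes each employee has already been assigned a neighborhood category label derived from block-group characteristics; the labels (`c07`, `c12`, and so on), the block-group identifiers, and the 25 percent cutoff are all hypothetical, and the source does not say how the chain chose which categories to target.

```python
from collections import Counter

# Hypothetical category label of each qualifying employee's home block group.
employee_categories = ["c07", "c07", "c12", "c07", "c31", "c12", "c07", "c12"]


def category_shares(labels):
    """Share of all employees falling into each neighborhood category."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


shares = category_shares(employee_categories)

# Hypothetical category of each block group in the recruiting market.
block_group_category = {
    "bg_001": "c07",
    "bg_002": "c44",
    "bg_003": "c12",
    "bg_004": "c31",
}


def target_block_groups(shares, bg_categories, min_share):
    """Block groups whose category holds at least `min_share` of employees."""
    return sorted(
        bg
        for bg, category in bg_categories.items()
        if shares.get(category, 0.0) >= min_share
    )


# The cutoff is an assumption chosen for this example.
targets = target_block_groups(shares, block_group_category, min_share=0.25)
print(shares)
print(targets)
```

Recruiting mail or newspaper advertising would then be directed at the block groups in `targets`.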

For selecting restaurant sites, a general area, as well as specific sites for the restaurant, can be evaluated. By looking at selected demographic data by specific levels of geography (e.g., counties and zip codes), the characteristics of the potential customers can be determined. Employment data at those same levels may also be evaluated. These analyses taken together can help the restaurateur select the best site for a successful restaurant (Thomas and Kirchner, 1991:61-63).


Like retailers and restaurateurs, banks and other financial institutions can select the best locations for branch offices by analyzing population, demographic, and economic data from the census. More importantly, however, banks and financial institutions require median household income and income distributions by census tracts to ensure compliance with federal mortgage lending guidelines regarding race, and for meeting other regulatory requirements, particularly the Community Reinvestment Act, Home Mortgage Disclosure Act, and the Federal Insurance Improvement Act of 1992.

For example, the Community Reinvestment Act mandates that financial institutions meet deposit and credit needs in the communities they serve. The federal agencies that supervise financial institutions are required to assess whether the financial institutions in an area are meeting the needs of the community. To assess its compliance with the mandates of the Community Reinvestment Act, a bank wanted to determine the ratio of its loans to its deposits. Using customer data and a software system that is able to link demographic and client information, the bank was able to determine the loan-to-deposit ratio for its service areas. Thus, the bank was able to assess itself whether it was complying with the Community Reinvestment Act before the regulatory agencies conducted their audits. If there were areas with a discrepancy between deposits and loans, the bank would be able to make corrections in those areas (Thomas and Kirchner, 1991:114-116).
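A loan-to-deposit check of this kind might look like the sketch below. It is a minimal illustration, not the bank's software: the service-area names and dollar amounts are invented, and the review threshold of 0.3 is an arbitrary value chosen for the example, not a regulatory figure.

```python
from collections import defaultdict

# Hypothetical customer records: (service area, record type, dollar amount).
records = [
    ("area_1", "deposit", 500_000.0),
    ("area_1", "loan", 400_000.0),
    ("area_2", "deposit", 800_000.0),
    ("area_2", "loan", 200_000.0),
    ("area_1", "deposit", 300_000.0),
]


def loan_to_deposit_ratios(records):
    """Total loans divided by total deposits for each service area."""
    loans = defaultdict(float)
    deposits = defaultdict(float)
    for area, kind, amount in records:
        if kind == "loan":
            loans[area] += amount
        elif kind == "deposit":
            deposits[area] += amount
    # Areas with no deposits are left out to avoid dividing by zero.
    return {area: loans[area] / total for area, total in deposits.items() if total > 0}


ratios = loan_to_deposit_ratios(records)

# Illustrative cutoff only; areas below it would be reviewed by the bank.
REVIEW_THRESHOLD = 0.3
flagged = sorted(area for area, ratio in ratios.items() if ratio < REVIEW_THRESHOLD)
print(ratios)
print(flagged)
```

Areas in `flagged` correspond to the places where the text says the bank could make corrections before an audit.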

Census data can be used by banks to develop locally focused marketing programs. For example, a bank can determine the potential success of a particular new service by looking at how and where to market the service. A demographic profile of service areas based on age, deposits, household income, and credit use can be created. By grouping and mapping the frequency of the four variables mentioned above, along with a consumer profile, areas where the service is likely to be used can be identified. Those areas then can be targeted for promotion and implementation of the new service (Thomas and Kirchner, 1991:93-97).

In trying to determine if acquiring a competing banking institution (Bank B) would be a feasible and profitable way to expand and diversify its services, Bank A wanted to assess the proximity of Bank B's branches to its existing branches, the comparability of existing customers of Bank A with Bank B, and the comparability of services offered by both banks. The population (current and future projected) of the areas surrounding branches was compared, and income estimates for Bank B's locations were analyzed by census tract level (Thomas and Kirchner, 1991:102-108). Using these analyses, Bank A is able to make the best decision about acquiring the competing bank.

A bank can analyze the potential performance of new and existing markets by developing a profile for evaluating those markets. By combining demographic characteristics of data on national financial behavior with demographic data for a particular market, a profile of the bank's service area can be developed. Using the average state performance of branches as a benchmark, the bank can determine the amount of increased business for areas performing below the state average if those areas grow to the state average level. This can help the bank determine areas for increased market analysis and marketing efforts, while also pinpointing markets that are performing at or above the state average that need to be maintained and protected from competitors (Thomas and Kirchner, 1991:111-113).
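The benchmark comparison can be written out as a short calculation. The sketch assumes performance is measured as deposits per household, which is only one plausible measure and is not specified in the source; the market names, household counts, and dollar figures are likewise invented.

```python
# Hypothetical markets: (households, deposits per household in dollars).
markets = {
    "market_A": (10_000, 1_200.0),
    "market_B": (8_000, 1_700.0),
    "market_C": (5_000, 1_400.0),
}

# Hypothetical state-average benchmark, in the same units.
STATE_AVERAGE = 1_500.0


def growth_potential(markets, benchmark):
    """Added deposits if each below-benchmark market rose to the benchmark."""
    return {
        name: (benchmark - rate) * households
        for name, (households, rate) in markets.items()
        if rate < benchmark
    }


def markets_to_protect(markets, benchmark):
    """Markets already performing at or above the benchmark."""
    return sorted(name for name, (_, rate) in markets.items() if rate >= benchmark)


potential_gain = growth_potential(markets, STATE_AVERAGE)
protect = markets_to_protect(markets, STATE_AVERAGE)
print(potential_gain)
print(protect)
```

Markets in `potential_gain` are candidates for further market analysis and marketing effort, while those in `protect` are the ones to defend against competitors.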


Newspapers use census data in stories to profile the demographics of blocks, neighborhoods, towns, cities, counties, states, and other geographic areas. Census data also provide demographic background for other stories of general and specific interest to the public, e.g., what are the socioeconomic characteristics of areas with the most lawlessness in the Los Angeles riots? What is the most "middle class" tract in L.A. County? And what are commuter travel patterns in Orange County? Examples included in responses to the panel's survey of state data centers (see Appendix E) noted that all variables to the block-group level in various census geographic files can be used to describe the demographic and economic characteristics of places and areas. Also reported in a survey response was that the Los Angeles Times recently used 1990 census data in more than 300 news stories within one year.

The collection of consumer zip codes may be used to create a consumer profile for an area. For instance, a radio station might collect a caller's zip code and link it to demographic data to develop profiles of listener preferences (Thomas and Kirchner, 1991:34). In turn, the station can determine the potential success of a particular radio format for a given area and target marketing campaigns accordingly. Those profiles can also be linked with ratings information and used to optimize advertising revenue.

A cable television company analyzed purchases of pay-per-view events using census tract maps (Thomas and Kirchner, 1991:37) and created customer profiles at the block-group level. Those customer profiles assisted the company in focusing its marketing efforts on specific customers. For example, pay-per-view sporting events can be marketed to the subscribers that are most likely to purchase the event, rather than to the entire customer base, thus increasing the advertising value.


In a case study, an insurance company wanted to determine if some of its offices had allowed policies to lapse more than others. The company first wanted to determine if sites with high lapse rates were located in areas with high-risk customers. To determine the different characteristics between lapsed customers and continuing customers, the company created a profile of current customers, as well as a profile of lapsed customers. Based on the profile, the company determined that the continuing customers were generally more affluent and more family-oriented. When the profile for continuing customers was compared to the profile for lapsed customers, the company found that lapsed customers "tended to be more downscale than average" (Thomas and Kirchner, 1991:119). Using the data, the company was able to estimate what the performance of various offices should be, based on their geographic locations. For example, some of the offices were located in areas where the population could be characterized as high-lapse customers. Those offices, it was determined, could expect lower overall performance (Thomas and Kirchner, 1991:117-121).


Utility companies use census data to target low-income areas or areas with special needs, as well as for market research. Most utility companies have special lower rates for poorer, elderly, or disabled customers. Census data help companies note special areas for individual contact and special services and rates. An electric or gas company can use customer records to determine their share of the market. Using customer address information, a utility company can determine areas where it might be desirable to increase customer volume through greater name recognition. Other companies are using census maps to plot the location of their utility lines so they can quickly reference the proximity of lines to population areas.


Health care providers use census data to determine the need for additional hospital services, physicians, urgent care facilities, or other type of medical services in an area. For example, a hospital used data to study population trends when looking into building an off-site facility in a rural area, so that better health care could be provided to residents in that area. Using characteristics such as race, age, sex, and income for the health service area, a provider can determine if there is a need for additional doctors or other health services in an area. By estimating the need for services in an area, the best site for a doctor's office can be determined (Thomas and Kirchner, 1991:130-136). A hospital's selection of urgent care center sites is aided by analyzing patient records (including address
