Data Science & Spatial Analysis Archives

Development of a Regional Lidar-Derived Above-Ground Biomass Model with Bayesian Model Averaging for Use in Ponderosa Pine and Mixed Conifer Forests in Arizona and New Mexico, USA

Tenneson, Karis; Patterson, Matthew S.; Mellin, Thomas; Nigrelli, Mark; Joria, Peter; Mitchell, Brent. (2018). Development of a Regional Lidar-Derived Above-Ground Biomass Model with Bayesian Model Averaging for Use in Ponderosa Pine and Mixed Conifer Forests in Arizona and New Mexico, USA. Remote Sensing, 10(3).

Abstract

Historical forest management practices in the southwestern US have left forests prone to high-severity, stand-replacement fires. Reducing the cost of forest-fire management and reintroducing fire to the landscape without negative impact depends on detailed knowledge of stand composition, in particular, above-ground biomass (AGB). Lidar-based modeling techniques provide opportunities to increase ability of managers to monitor AGB and other forest metrics at reduced cost. We developed a regional lidar-based statistical model to estimate AGB for Ponderosa pine and mixed conifer forest systems of the southwestern USA, using previously collected field data. Model selection was performed using Bayesian model averaging (BMA) to reduce researcher bias, fully explore the model space, and avoid overfitting. The selected model includes measures of canopy height, canopy density, and height distribution. The model selected with BMA explains 71% of the variability in field-estimates of AGB, and the RMSE of the two independent validation data sets are 23.25 and 32.82 Mg/ha. The regional model is structured in accordance with previously described local models, and performs equivalently to these smaller scale models. We have demonstrated the effectiveness of lidar for developing cost-effective, robust regional AGB models for monitoring and planning adaptively at the landscape scale.

Keywords

Laser Scanner Data; Landscape Restoration Program; Canopy Fuel Parameters; Discrete-return Lidar; Western United-states; Wave-form Lidar; Airborne Laser; Tropical Forest; Climate-change; Adaptive Management; Forest Biomass; Aboveground Biomass; Airborne Lidar; Monitoring; Regional Forest Inventory; Variable Selection; Bayesian Model Averaging; Multiple Linear Regression
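The abstract above names Bayesian model averaging (BMA) as the model-selection method. The following is a minimal sketch of that idea, assuming ordinary least-squares candidate models and the common BIC approximation to posterior model weights. The function name `bma_linear` and the BIC weighting are illustrative assumptions, not the authors' implementation.

```python
import itertools
import numpy as np

def bma_linear(X, y):
    """Average over all predictor subsets, weighting each model by exp(-BIC/2).

    Returns (model weights, averaged coefficients with intercept first,
    posterior inclusion probability of each predictor).
    """
    n, p = X.shape
    bics, coefs, included = [], [], []
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            # Design matrix: intercept plus the predictors in this subset.
            A = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ beta) ** 2))
            bics.append(n * np.log(rss / n) + A.shape[1] * np.log(n))
            # Store coefficients in a full-length vector (zeros where excluded).
            full = np.zeros(p + 1)
            full[0] = beta[0]
            for pos, j in enumerate(subset):
                full[j + 1] = beta[pos + 1]
            coefs.append(full)
            included.append([1.0 if j in subset else 0.0 for j in range(p)])
    bics = np.array(bics)
    # Subtract the minimum BIC before exponentiating for numerical stability.
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()
    return weights, weights @ np.array(coefs), weights @ np.array(included)
```

Predictors with an inclusion probability near 1 appear in nearly all of the well-supported models, which is one way to read "selecting" variables without committing to a single hand-picked model.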

Built Environment Factors in Explaining the Automobile-Involved Bicycle Crash Frequencies: A Spatial Statistic Approach

Chen, Peng. (2015). Built Environment Factors in Explaining the Automobile-Involved Bicycle Crash Frequencies: A Spatial Statistic Approach. Safety Science, 79, 336 – 343.

Abstract

The objective of this study is to understand the relationship between built environment factors and bicycle crashes with motor vehicles involved in Seattle. The research method employed is a Poisson lognormal random effects model using hierarchal Bayesian estimation. The Traffic Analysis Zone (TAZ) is selected as the unit of analysis to quantify the built environment factors. The assembled dataset provides a rich source of variables, including road network, street elements, traffic controls, travel demand, land use, and socio-demographics. The research questions are twofold: how are the built environment factors associated with the bicycle crashes, and are the TAZ-based bicycle crashes spatially correlated? The findings of this study are: (1) safety improvements should focus on places with more mixed land use; (2) off-arterial bicycle routes are safer than on-arterial bicycle routes; (3) TAZ-based bicycle crashes are spatially correlated; (4) TAZs with more road signals and street parking signs are likely to have more bicycle crashes; and (5) TAZs with more automobile trips have more bicycle crashes. For policy implications, the results suggest that the local authorities should lower the driving speed limits, regulate cycling and driving behaviors in areas with mixed land use, and separate bike lanes from road traffic.

Keywords

Injury Crashes; Risk Analysis; Models; Infrastructure; Dependence; Counts; Level; Bicycle Crash Frequency; Hierarchal Bayesian Estimation; Poisson Lognormal Random Effects Model; Built Environment; Traffic Analysis Zone

Estimating Traffic Volume for Local Streets with Imbalanced Data

Chen, Peng; Hu, Songhua; Shen, Qing; Lin, Hangfei; Xie, Chi. (2019). Estimating Traffic Volume for Local Streets with Imbalanced Data. Transportation Research Record, 2673(3), 598 – 610.

Abstract

Annual average daily traffic (AADT) is an important measurement used in traffic engineering. Local streets are major components of a road network. However, automatic traffic recorders (ATRs) used to collect AADT are often limited to arterial roads, and such information is, therefore, often unavailable for local streets. Estimating AADT on local streets becomes a necessity as local street traffic continues to grow and the capacity of arterial roads becomes insufficient. A challenge is that an under-represented sample of local street AADT may result in biased estimation. A synthetic minority oversampling technique (SMOTE) is applied to oversample local streets to correct the imbalanced sampling among different road types. A generalized linear mixed model (GLMM) is employed to estimate AADT incorporating various independent variables, including factors of roadway design, socio-demographics, and land use. The model is examined with an AADT dataset from Seattle, WA. Results show that: (1) SMOTE helps to correct imbalanced sampling proportions and improve model performance significantly; (2) the number of lanes and the number of crosswalks are both positively associated with AADT; (3) road segments located in areas with a higher population density or more mixed land use have a higher AADT; (4) distance to the nearest arterial road is negatively correlated with AADT; and (5) AADT creates spatial spillover effects on neighboring road segments. The combination of SMOTE and GLMM improves the estimation accuracy on AADT, which contributes to better data for transportation planning and traffic monitoring, and to cost saving on data collection.

Keywords

Average; Prediction; Network; County
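The abstract above applies a synthetic minority oversampling technique (SMOTE) to the under-represented local-street sample. As a rough illustration of the core interpolation step, here is a minimal sketch that assumes purely numeric features and Euclidean nearest neighbors. The function name `smote` and its parameters are hypothetical and do not reproduce the study's pipeline.

```python
import numpy as np

def smote(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority samples.

    Each synthetic row lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    Requires at least two minority samples.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise Euclidean distances; a sample is never its own neighbor.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbors = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation fraction in [0, 1)
    return X_min[base] + gap * (X_min[chosen] - X_min[base])
```

In a workflow like the one described, the synthetic rows would be appended to the minority class before fitting the regression model. Categorical predictors would need a variant of the technique, which this sketch does not cover.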

Phasic Metropolitan Settlers: A Phase-Based Model for the Distribution of Households in US Metropolitan Regions

Estiri, Hossein; Krause, Andy; Heris, Mehdi P. (2015). Phasic Metropolitan Settlers: A Phase-Based Model for the Distribution of Households in US Metropolitan Regions. Urban Geography, 36(5), 777 – 794.

Abstract

In this article, we develop a model for explaining spatial patterns in the distribution of households across metropolitan regions in the United States. First, we use housing consumption and residential mobility theories to construct a hypothetical probability distribution function for the consumption of housing services across three phases of household life span. We then hypothesize a second probability distribution function for the offering of housing services based on the distance from city center(s) at the metropolitan scale. Intersecting the two hypothetical probability functions, we develop a phase-based model for the distribution of households in US metropolitan regions. We argue that phase one households (young adults) are more likely to reside in central city locations, whereas phase two and three households are more likely to select suburban locations, due to their respective housing consumption behaviors. We provide empirical validation of our theoretical model with the data from the 2010 US Census for 35 large metropolitan regions.

Keywords

Residential-mobility; Life-course; Housing Consumption; Family; Satisfaction; Migration; Geography; Context; Age; Distribution Patterns; Us Metropolitan Regions; Household

Intersections and Non-Intersections: A Protocol for Identifying Pedestrian Crash Risk Locations in GIS

Kang, Mingyu; Moudon, Anne Vernez; Kim, Haena; Boyle, Linda Ng. (2019). Intersections and Non-Intersections: A Protocol for Identifying Pedestrian Crash Risk Locations in GIS. International Journal Of Environmental Research And Public Health, 16(19).

Abstract

Intersection and non-intersection locations are commonly used as spatial units of analysis for modeling pedestrian crashes. While both location types have been previously studied, comparing results is difficult given the different data and methods used to identify crash-risk locations. In this study, a systematic and replicable protocol was developed in GIS (Geographic Information System) to create a consistent spatial unit of analysis for use in pedestrian crash modeling. Four publicly accessible datasets were used to identify unique intersection and non-intersection locations: roadway intersection points, roadway lanes, legal speed limits, and pedestrian crash records. Two algorithms were developed and tested using five search radii (ranging from 20 to 100 m) to assess the protocol reliability. The algorithms, which were designed to identify crash-risk locations at intersection and non-intersection areas, detected 87.2% of the pedestrian crash locations (r: 20 m). Agreement rates between algorithm results and the crash data were 94.1% for intersection and 98.0% for non-intersection locations, respectively. The buffer size of 20 m generally showed the highest performance in the analyses. The present protocol offered an efficient and reliable method to create spatial analysis units for pedestrian crash modeling. It provided researchers a cost-effective method to identify unique intersection and non-intersection locations. Additional search radii should be tested in future studies to refine the capture of crash-risk locations.

Keywords

Traffic Crash; Walking; Collisions; Accidents; Models; Pedestrian Safety; Spatial Autocorrelation; Algorithm
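The search-radius idea in the abstract above can be sketched as a simple distance test. This sketch assumes projected coordinates in meters and labels a crash by its distance to the nearest intersection point. It is a simplification: the published protocol also uses roadway lanes and speed limits, and the function name `classify_crashes` is hypothetical.

```python
import math

def classify_crashes(crashes, intersections, radius_m=20.0):
    """Label each crash point by whether it falls within radius_m of any intersection.

    crashes and intersections are sequences of (x, y) pairs in meters.
    """
    labels = []
    for cx, cy in crashes:
        nearest = min(math.hypot(cx - ix, cy - iy) for ix, iy in intersections)
        labels.append("intersection" if nearest <= radius_m else "non-intersection")
    return labels

# Usage with made-up coordinates: two intersections 100 m apart.
example = classify_crashes([(5, 5), (50, 0), (100, 19)], [(0, 0), (100, 0)])
```

Re-running the same function with `radius_m` set to each of the tested radii would show how the labels shift as the buffer grows.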

Split-Match-Aggregate (SMA) Algorithm: Integrating Sidewalk Data with Transportation Network Data in GIS

Kang, Bumjoon; Scully, Jason Y.; Stewart, Orion; Hurvitz, Philip M.; Moudon, Anne V. (2015). Split-Match-Aggregate (SMA) Algorithm: Integrating Sidewalk Data with Transportation Network Data in GIS. International Journal Of Geographical Information Science, 29(3), 440 – 453.

Abstract

Sidewalk geodata are essential to understand walking behavior. However, such geodata are scarce, only available at the local jurisdiction and not at the regional level. If they exist, the data are stored in geometric representational formats without network characteristics such as sidewalk connectivity and completeness. This article presents the Split-Match-Aggregate (SMA) algorithm, which automatically conflates sidewalk information from secondary geometric sidewalk data to existing street network data. The algorithm uses three parameters to determine geometric relationships between sidewalk and street segments: the distance between streets and sidewalk segments; the angle between sidewalk and street segments; and the difference between the lengths of matched sidewalk and street segments. The SMA algorithm was applied in urban King County, WA, to 13 jurisdictions' secondary sidewalk geodata. Parameter values were determined based on agreement rates between results obtained from 72 pre-specified parameter combinations and those of a trained geographic information systems (GIS) analyst using a randomly selected 5% of the 79,928 street segments as a parameter-development sample. The algorithm performed best when the distances between sidewalk and street segments were 12m or less, their angles were 25 degrees or less, and the tolerance was set to 18m, showing an excellent agreement rate of 96.5%. The SMA algorithm was applied to classify sidewalks in the entire study area and it successfully updated sidewalk coverage information on the existing regional-level street network data. The algorithm can be applied for conflating attributes between associated, but geometrically misaligned line data sets in GIS.

Keywords

Geodatabases; Sidewalks; Algorithms; Pedestrians; Digital Mapping; Algorithm; Gis; Pedestrian Network Data; Polyline Conflation; Sidewalk; Built Environment; Physical-activity; Mode Choice; Urban Form; Land-use; Travel; Generation; Walking
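The three matching parameters reported in the SMA abstract above (distance, angle, and length difference) can be illustrated with a small predicate. This is a sketch under stated assumptions: segments are straight lines in projected meters, distance is measured from the sidewalk midpoint to the street segment, and the 18 m tolerance is read as the allowed length difference. The helper names are hypothetical, and the split and aggregate stages of the algorithm are not shown.

```python
import math

def _length(seg):
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)

def _angle_between(a, b):
    """Undirected angle between two segments, in degrees within [0, 90]."""
    def bearing(seg):
        (x1, y1), (x2, y2) = seg
        return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    diff = abs(bearing(a) - bearing(b))
    return min(diff, 180.0 - diff)

def _point_to_segment(p, seg):
    """Shortest distance from point p to a line segment."""
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(p[0] - x1, p[1] - y1)
    t = ((p[0] - x1) * dx + (p[1] - y1) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (x1 + t * dx), p[1] - (y1 + t * dy))

def is_match(sidewalk, street, max_dist=12.0, max_angle=25.0, length_tol=18.0):
    """True when a sidewalk segment plausibly belongs to a street segment."""
    mid = ((sidewalk[0][0] + sidewalk[1][0]) / 2, (sidewalk[0][1] + sidewalk[1][1]) / 2)
    return (_point_to_segment(mid, street) <= max_dist
            and _angle_between(sidewalk, street) <= max_angle
            and abs(_length(sidewalk) - _length(street)) <= length_tol)
```

The default thresholds mirror the best-performing values quoted in the abstract. Testing other combinations, as the authors did against an analyst's classification, only requires passing different keyword arguments.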

Quantifying Economic Effects of Transportation Investment Considering Spatiotemporal Heterogeneity in China: A Spatial Panel Data Model Perspective

Lin, Xiongbin; Maclachlan, Ian; Ren, Ting; Sun, Feiyang. (2019). Quantifying Economic Effects of Transportation Investment Considering Spatiotemporal Heterogeneity in China: A Spatial Panel Data Model Perspective. The Annals Of Regional Science, 63(3), 437 – 459.

Abstract

Transportation investment plays a significant role in promoting economic development. However, in what scenario and to what extent transportation investment can stimulate economic growth still remains debatable. For developing countries undergoing rapid urbanization, answering these questions is necessary for evaluating proposals and determining investment plans, especially considering the heterogeneity of spatiotemporal conditions. Current literature lacks systematical research to consider the impacts of panel data and spatial correlation issue in examining the economic effects of transportation investment. To fill this gap, this study collects provincial panel data in China from 1997 to 2015 to evaluate multi-level temporal and spatial effects of transportation investment on economic growth by using spatial panel data analysis. Results show that transportation investment leads to significant and positive effects on growth and spatial concentration of economic activities, but these results vary significantly depending on the temporal and spatial characteristics of each province. The economic impacts of transportation investment are quite positive even considering the time lag effects. This study suggests that both central and local governments should carefully evaluate the multifaceted economic effects of transportation investment, such as a balanced transportation investment and economic development between growing and lagging regions, and considering the spatiotemporal heterogeneity of the economic environment.

Keywords

High-speed Rail; Infrastructure Investment; Causal Relationship; Empirical-analysis; Growth; Impact; Productivity; Efficiency; Spillover; Agglomeration; C33; R40; R58; Spatial Analysis; Time Lag; Urbanization; Transportation; Heterogeneity; Economic Growth; Economic Models; Economic Impact; Data Analysis; Spatial Data; Panel Data; Economic Development; Developing Countries--ldcs; Investments; Economic Analysis; Investment; Local Government; China

Domain Knowledge-Based Information Retrieval for Engineering Technical Documents

Shang-hsien Hsieh; Ken-yu Lin; Nai-wen Chi; Hsien-tang Lin. (2015). Domain Knowledge-Based Information Retrieval for Engineering Technical Documents. Ontology In The AEC Industry. A Decade Of Research And Development In Architecture, Engineering And Construction, chapter 1.

Abstract

Technical documents with complicated structures are often produced in architecture/engineering/construction (AEC) projects and research. Information retrieval (IR) techniques provide a possible solution for managing the ever-growing volume and contexts of the knowledge embedded in these technical documents. However, applying a general-purpose search engine to a domain-specific technical document collection often produces unsatisfactory results. To address this problem, we research the development of a novel IR system based on passage retrieval techniques. The system employs domain knowledge to assist passage partitioning and supports an interactive concept-based expanded IR for technical documents in an engineering field. The engineering domain selected in this case is earthquake engineering, although the technologies developed and employed by the system should be generally applicable to many other engineering domains that use technical documents with similar characteristics. We carry out the research in a three-step process. In the first step, since the final output of this research is an IR system, as a prerequisite, we created a reference collection which includes 111 earthquake engineering technical documents from Taiwan's National Center for Research on Earthquake Engineering. With this collection, the effectiveness of the IR system can be further evaluated once it is developed. In the second step, the research focuses on creating a base domain ontology using an earthquake-engineering handbook to represent the domain knowledge and to support the target IR system with the knowledge. In step three, the research focuses on the semantic querying and retrieval mechanisms and develops the OntoPassage approach to help with the mechanisms. The OntoPassage approach partitions a document into smaller passages, each with around 300 terms, according to the main concepts in the document. This approach is then used to implement the target domain knowledge-based IR system that allows users to interact with the system and perform concept-based query expansions. The results show that the proposed domain knowledge-based IR system can achieve not only an effective IR but also inform search engine users with a clear knowledge representation.

Keywords

Architecture; Construction; Engineering; Knowledge Based Systems; Ontologies (artificial Intelligence); Query Processing; Search Engines; Knowledge Representation; Concept-based Query Expansions; Base Domain Ontology; Earthquake Engineering; General-purpose Search Engine; Aec Projects; Architecture/engineering/construction Projects; Complicated Structures; Technical Documents; Domain Knowledge-based Information Retrieval
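The passage-partitioning step described in the abstract above can be approximated, very roughly, by grouping sentences until a target term count is reached. This sketch deliberately ignores the ontology: the OntoPassage approach places boundaries according to the main concepts in the document, whereas this hypothetical `partition_passages` function uses only sentence ends and a size target of about 300 terms.

```python
import re

def partition_passages(text, target_terms=300):
    """Group consecutive sentences into passages of roughly target_terms terms."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    passages, current, count = [], [], 0
    for sentence in sentences:
        current.append(sentence)
        count += len(sentence.split())
        # Close the passage once it reaches the target size.
        if count >= target_terms:
            passages.append(" ".join(current))
            current, count = [], 0
    if current:
        passages.append(" ".join(current))
    return passages
```

A concept-aware version would replace the size check with a test for a change in the dominant ontology concept, which requires the domain ontology the authors built and is outside this sketch.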

A Tutorial on Dynasearch: A Web-Based System for Collecting Process-Tracing Data in Dynamic Decision Tasks

Lindell, Michael K.; House, Donald H.; Gestring, Jordan; Wu, Hao-Che. (2019). A Tutorial on Dynasearch: A Web-Based System for Collecting Process-Tracing Data in Dynamic Decision Tasks. Behavior Research Methods, 51(6), 2646 – 2660.

Abstract

This tutorial describes DynaSearch, a Web-based system that supports process-tracing experiments on coupled-system dynamic decision-making tasks. A major need in these tasks is to examine the process by which decision makers search over a succession of situation reports for the information they need in order to make response decisions. DynaSearch provides researchers with the ability to construct and administer Web-based experiments containing both between- and within-subjects factors. Information search pages record participants' acquisition of verbal, numeric, and graphic information. Questionnaire pages query participants' recall of information, inferences from that information, and decisions about appropriate response actions. Experimenters can access this information in an online viewer to verify satisfactory task completion and can download the data in comma-separated text files that can be imported into statistical analysis packages.

Keywords

Downloading; Text Files; Tasks; Access To Information; Statistics; Dynamic Decision Making; Process Tracing; Web-based Experiments; Information Search; Human-behavior; Eye-tracking; Choice; Expectations; Strategies; Mousetrap; Software; Time

Push, Pull, and Spill: A Transdisciplinary Case Study in Municipal Open Government

Whittington, Jan; Calo, Ryan; Simon, Mike; Jesse Woo; Meg Young; Schmiedeskamp, Peter. (2015). Push, Pull, and Spill: A Transdisciplinary Case Study in Municipal Open Government. Berkeley Technology Law Journal, 30(3), 1899 – 1966.

Abstract

Municipal open data raises hopes and concerns. The activities of cities produce a wide array of data, data that is vastly enriched by ubiquitous computing. Municipal data is opened as it is pushed to, pulled by, and spilled to the public through online portals, requests for public records, and releases by cities and their vendors, contractors, and partners. By opening data, cities hope to raise public trust and prompt innovation. Municipal data, however, is often about the people who live, work, and travel in the city. By opening data, cities raise concern for privacy and social justice. This article presents the results of a broad empirical exploration of municipal data release in the City of Seattle. In this research, parties affected by municipal practices expressed their hopes and concerns for open data. City personnel from eight prominent departments described the reasoning, procedures, and controversies that have accompanied their release of data. All of the existing data from the online portal for the city were joined to assess the risk to privacy inherent in open data. Contracts with third parties involving sensitive or confidential data about residents of the city were examined for safeguards against the unauthorized release of data. Results suggest the need for more comprehensive measures to manage the risk latent in opening city data. Cities should maintain inventories of data assets, produce data management plans pertaining to the activities of departments, and develop governance structures to deal with issues as they arise--centrally and amongst the various departments--with ex ante and ex post protocols to govern the push, pull, and spill of data. In addition, cities should consider conditioned access to pushed data, conduct audits and training around public records requests, and develop standardized model contracts to protect against the spill of data by third parties. 

Keywords

Public Records; Open Data Movement; Acquisition Of Data; Ubiquitous Computing; Data Analysis; Social Justice