
Machine Learning in the Regulatory Process

October 17, 2018
The rapid increase in the amount of recorded data has given governments access to more personal information than ever before. While governments receive an unprecedented amount of information, this does not necessarily make the regulation or supervision of this data any easier. Therefore, it is becoming ever more important to analyze and synthesize these large amounts of data in order to improve the government’s regulation and supervision. In light of this need, governments have begun to implement machine learning and big data analytics in order to help analyze all the incoming data and extract valuable insights.

For example, in Pittsburgh, the county government has digitized all of its records and is beginning to use big data analytical tools to improve its health and human services. [1] While analyzing public data is nothing new, the scale of the data and the tools used to analyze it effectively and efficiently are expanding. Pittsburgh is now using machine learning to identify children who are more likely to suffer an injury or death because of potentially unsafe living conditions, a task that was previously done manually. [2]
Another example is using machine learning to determine which restaurants need hygiene inspections by applying natural language processing to restaurant reviews. [3] Machine learning algorithms will also help governments better allocate their inspectors, improving operational efficiency.
However, the algorithms are not restricted to local issues. Machine learning algorithms are also being used to allocate refugees to the cities in which they would best be able to integrate. [4] This algorithm has boosted refugee employment by anywhere from 40 to 70% in cities around the world. [5]
Machine learning not only allows governments to analyze the large amount of data they collect much faster, but it also reduces the number of errors made. Furthermore, the machine learning algorithms may be able to detect patterns better than humans. [6] For example, it may be difficult for a human to detect all the relationships between variables in datasets with millions of data points. However, algorithms would have a much easier time handling this amount of data as well as finding the different relationships between variables. Additionally, humans can get tired and make sloppy mistakes whereas computers will not get tired. [7]
However, there are some complications that officials must consider as they begin to implement machine learning. While the algorithms may not tire, they also do not have a human mind that is able to recognize unusual data or understand when something does not look right in the data. For example, if there are negative values in a column with people’s ages, a computer would incorporate those negative values into the model it computes, but a human would know to either impute those values or exclude those observations from the model. Therefore, data quality and consistency will become even more crucial as these algorithms are implemented more widely. Additionally, the results of these algorithms are not infallible. Officials will also have to check that the results actually make sense and that the relationships the machines find are causal, not merely correlational. [8]
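The negative-age example can be made concrete with a short sketch. This is only an illustration, not a method taken from the cited sources: the function name `clean_ages`, the `max_age` cutoff of 120, and the choice of the median as the imputed value are all assumptions made for the example.

```python
from statistics import median


def clean_ages(ages, strategy="impute", max_age=120):
    """Handle implausible ages (hypothetical helper for illustration).

    strategy="exclude" drops invalid observations;
    strategy="impute" replaces them with the median of the valid ones.
    """
    def is_valid(age):
        return age is not None and 0 <= age <= max_age

    valid = [a for a in ages if is_valid(a)]
    if strategy == "exclude":
        return valid
    if strategy == "impute":
        if not valid:
            raise ValueError("no valid ages to impute from")
        fill = median(valid)
        return [a if is_valid(a) else fill for a in ages]
    raise ValueError(f"unknown strategy: {strategy}")


ages = [34, -2, 51, 27, -40, 63]
print(clean_ages(ages, "exclude"))  # [34, 51, 27, 63]
print(clean_ages(ages, "impute"))   # [34, 42.5, 51, 27, 42.5, 63]
```

A model trained on the raw list would treat -2 and -40 as real ages. The check has to be written down explicitly because the algorithm will not notice the problem on its own.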
Before these algorithms were used, it was easier to explain how certain conclusions were drawn and what analysis went into them. Humans were usually involved at every or most steps, and someone had to make each decision and justify it. Therefore, as long as their justifications were documented, it was pretty straightforward to explain why they did certain things and chose not to do others. In contrast, new algorithms are a black box that reveals nothing about how conclusions are made. It is becoming harder and harder for humans to understand every step or decision the machine makes. Furthermore, these machines do not have any intuitive understanding of the data. They simply try to find correlations and make predictions. This makes it harder for other people to buy into the algorithm’s findings and will be something officials need to deal with moving forward.

An illustration of how machine learning algorithms work as a “black box.” By Docurbs - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=52379695



There are some other issues that will need to be addressed going forward as well. Using an algorithm to improve public policy may seem unbiased on the outside, especially compared to issues we have seen in the past regarding officials and racial profiling. However, these machine learning algorithms can encode existing biases, for example, in racialized law enforcement. [9] One method to counter potential bias is to build the model to consider the differences between populations and to avoid over-generalizations. [10] Officials will also have to consider using different models for different populations. [11] A different option would be to give the algorithms fewer rules for populations that are less well represented in the data. [12] This would make sense because a small sample from a less represented population may not reflect that population as a whole, so one would not want to draw as many conclusions from it. While these issues must be dealt with in order to improve the algorithms, there is no single correct answer, so officials will have to weigh the pros and cons of each of these options.
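As a rough sketch of the "fewer rules for less represented populations" idea, the number of rules a model may learn for a group can be capped by that group's sample size. The function `rule_budget`, the group names, and the thresholds (`max_rules=20`, `samples_per_rule=50`) are hypothetical values chosen only for illustration. They do not come from the cited articles.

```python
def rule_budget(n_samples, max_rules=20, samples_per_rule=50):
    """Allow roughly one rule per `samples_per_rule` observations.

    The result is clamped to the range [1, max_rules], so small groups
    get simpler models and large groups cannot grow without bound.
    """
    return max(1, min(max_rules, n_samples // samples_per_rule))


# Hypothetical group sizes in a training dataset.
group_sizes = {"group_a": 5000, "group_b": 400, "group_c": 30}
budgets = {group: rule_budget(n) for group, n in group_sizes.items()}
print(budgets)  # {'group_a': 20, 'group_b': 8, 'group_c': 1}
```

The budget could then limit, for example, the depth of a decision tree or the length of a rule list fitted for each group. The thresholds themselves are a policy judgment, which is part of the trade-off officials would have to weigh.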

Different algorithms and assumptions made can yield different results. By Shiyu Ji - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=60632949



In the end, once governments can leap over some of the present-day hurdles, the future benefits of comprehensive machine learning implementation could be tremendous. The applications of these algorithms are potentially limitless and will be able to improve many facets of public sector services.

Student Blog Disclaimer
  • The views expressed on the Student Blog are the author’s opinions and don’t necessarily represent the Penn Wharton Public Policy Initiative’s strategies, recommendations, or opinions.

 

References


  [1] https://www.theregreview.org/2017/10/11/cramer-machine-learning-public-services/
  [2] https://www.theregreview.org/2017/10/11/cramer-machine-learning-public-services/
  [3] https://www.datacareer.ch/blog/machine-learning-in-public-policy-examples-from-research/
  [4] https://www.datacareer.ch/blog/machine-learning-in-public-policy-examples-from-research/
  [5] https://www.datacareer.ch/blog/machine-learning-in-public-policy-examples-from-research/
  [6] https://philadelphiafed.org/-/media/bank-resources/supervision-and-regulation/events/2017/fintech/resources/18_slides_wall.pdf?la=en
  [7] https://philadelphiafed.org/-/media/bank-resources/supervision-and-regulation/events/2017/fintech/resources/18_slides_wall.pdf?la=en
  [8] https://philadelphiafed.org/-/media/bank-resources/supervision-and-regulation/events/2017/fintech/resources/18_slides_wall.pdf?la=en
  [9] https://www.theregreview.org/2017/10/02/schlabs-usefulness-dangers-machine-learning/
  [10] https://www.theregreview.org/2017/10/03/schlabs-machine-learning-fairness-justice/
  [11] https://www.theregreview.org/2017/10/02/schlabs-usefulness-dangers-machine-learning/
  [12] https://www.theregreview.org/2017/10/02/schlabs-usefulness-dangers-machine-learning/
