
Big Data’s Big Role in Finance and Financial Regulation

August 01, 2018

Picture this: you’re tired of the spam in your inbox, so you download a new browser app that blocks it. During the download, the Terms of Agreement pop up, and you click ‘Agree’ – because why wouldn’t you? Unbeknownst to you, while you enjoy your spam-free email, the Slice Technologies app is scanning your emails for purchase receipts and selling the anonymized data to hedge funds.[1] Is that an invasion of privacy? Not quite, since you agreed to the terms. But why would hedge funds and other investment advisers want this information? With this kind of alternative data, investment firms can make much more accurate predictions about a company’s sales revenue and financial health. This new world of alternative data offers enormous alpha-creating potential for investment advisers, and it raises new legal concerns for courts and regulators.

As the old adage goes, knowledge is power. In today’s information age, data is created incessantly by nearly everything we do: the websites we visit, the places we go, the purchases we make, and much more. Alternative data is data drawn from sources outside those traditionally used by investment advisers, such as government reports, SEC filings, and quarterly earnings reports. Examples include satellite imagery of crops used to predict yields and anonymized reports of what people purchase with their credit and debit cards. Paired with powerful computational analysis, this data can inform financial decisions and identify economic trends well before traditional analyses could – and investors are catching on. The data is typically collected and cleaned by data vendors that package and sell it to Wall Street for millions. According to AlternativeData.org, the alternative data market was worth $200 million in 2016, and spending on alternative datasets and associated infrastructure is expected to reach $1.7 billion by 2020.[2] Some larger firms like BlackRock and Goldman Sachs already use internal programs that apply natural language processing to news stories, analyst reports, and social media to make informed inferences about capital markets.[3]

Caption: Alternative data providers, like Orbital Insight, forecast corn yields using satellite imagery. They often predict yields before the US Department of Agriculture releases official statistics, and do so more accurately. Source: The Economist

Alternative data provide algorithms with the fuel to make financial decisions and predictions more accurately than ever before, and investment advisers are increasingly incorporating quantitative trading methods into their investment strategies. According to the Tabb Group, quantitative hedge funds account for 27% of all U.S. stock trades by investors, and these firms are constantly searching for more innovative and effective sources of data.[4] For instance, data tracking corporate jets is used to predict mergers and/or acquisitions. In 2017, hedge fund managers anticipated the acquisition of Actelion, a Swiss pharmaceutical company, by Johnson & Johnson after tracking the company’s flights to Switzerland.[5]

New data providers continue to flood the emerging alternative data industry. According to AlternativeData.org, over 300 alternative data providers exist in 2018, up from 100 in 2008.[15]

While alternative data is extremely valuable, the point at which its use crosses a legal line is not yet clear, and regulators know it. When does this type of granular insight become, say, a privacy violation or insider trading? In terms of privacy, data vendors typically try to avoid liability by removing personally identifiable information (PII) from the data they sell. The Investment Data Standards Organization (IDSO), a 501(c) nonprofit founded in January 2018, creates standards for the new alternative data industry. One of its products, currently a work in progress, is a list of best practices for handling data sets that include PII.[6] Fortunately for us, granular data is not that important to those purchasing alternative datasets; what matters are the aggregate trends for entire companies or industries that can be inferred from the sum of many individuals’ purchases or other behavior. However, some states are taking a more proactive approach to the data protection problem. On June 28, 2018, California passed a bill that requires companies to disclose what kinds of data they collect and to which third parties the data is sold. It also requires that consumers be able to opt out of having their data sold.[7]

In terms of insider trading, there are still open questions regarding alternative datasets. The point at which alternative data becomes material nonpublic information (MNPI) is not yet clearly defined. According to Bloomberg Law, much of the risk that a data purchaser takes on lies in ensuring the data vendor has the legal right to sell the data at all.[8] Investment firms can minimize this risk by asking data vendors for a warranty that the data sets do not include MNPI and that the data was collected in a way that did not breach the vendor’s legal duties to the source. Tammer Kamel, CEO of alternative data provider Quandl, prefers to purchase data from companies that provide investor information on other companies, to avoid any suggestion that the information came from an insider.[9] However, much of the discussion on the role of alternative data in insider trading will hinge on what is considered nonpublic information. For example, while alternative data on new iPhone sales may not be widely disseminated, nothing stops an investment adviser from surveying everyone who walks out of an Apple store to ask whether they bought the new iPhone.

However, clear cases of insider trading using alternative data do exist. In SEC v. Huang, two former Capital One employees were prosecuted for misappropriating a confidential database of credit card purchases and making a $2.8 million profit on a $147,300 investment.[10] The employees searched the transaction database to better predict the sales of public companies ahead of their quarterly earnings reports. In one instance, after analyzing sales of outdoor gear retailer Cabela’s, one of the employees purchased $51,890 in put option contracts. The next day, after Cabela’s announced a 10% decrease in sales, he sold the contracts and made a 108% return.[11] Here, the SEC successfully argued that the case met the criteria for insider trading: material nonpublic information was misappropriated in a breach of duty and was then traded on. Although the case fits the established misappropriation theory of insider trading, it illustrates how data can and will be misused to outperform the market going forward, and why regulatory agencies must be able to address this new problem.
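To put the figures from the Huang case in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes that "return" means profit divided by the amount invested, which the article does not spell out.

```python
def return_pct(profit: float, invested: float) -> float:
    """Simple return: profit as a percentage of the amount invested."""
    if invested <= 0:
        raise ValueError("invested must be positive")
    return 100.0 * profit / invested


def profit_from_return(invested: float, pct: float) -> float:
    """Profit implied by a simple percentage return on an investment."""
    return invested * pct / 100.0


# Figures quoted in the article; "return" is assumed to mean profit / cost.
overall_return = return_pct(2_800_000, 147_300)
cabelas_profit = profit_from_return(51_890, 108)

print(f"Overall return: {overall_return:.0f}%")
print(f"Implied profit on the Cabela's puts: ${cabelas_profit:,.0f}")
```

Under that assumption, the overall trading works out to a return of roughly 1,900%, and the single Cabela’s trade implies a profit of about $56,000 in one day.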

As big data and alternative data continue to play a larger role in investment decisions, the U.S. Securities and Exchange Commission (SEC) and other regulatory bodies will have to keep up to ensure fair and efficient markets in the changing financial landscape. With the power of big data, regulatory agencies like the SEC and FINRA can move away from reactive enforcement policies – like starting investigations only after suspicious activity has been detected – and toward a more proactive approach. A “trader-based” method of detection uses analytics to identify suspicious trading patterns among networks of traders and to trace potential sources of material nonpublic information. The SEC has already started down this path with the creation of the Market Abuse Unit (MAU).[12] In one 2017 case, the MAU was credited with detecting “patterns of insider trading” that led to the sentencing of an investment banker and a plumber on insider trading charges.[13],[14] The banker had sold tips on the status of 10 mergers and acquisitions, allowing the plumber to earn over $76,000 in profits. With new techniques and technology, the SEC and FINRA, among others, will be able to identify illicit activity more effectively.

While innovative uses of big data propel investment firms forward in terms of performance, investment advisers must be careful to avoid legal risks that include privacy violations and insider trading. This unprecedented era of data collection and analysis has changed the financial landscape, but it will also change the regulatory landscape: enforcement agencies must adapt and use the same analytical techniques to detect illegal activity more efficiently.

Student Blog Disclaimer
  • The views expressed on the Student Blog are the author’s opinions and don’t necessarily represent the Penn Wharton Public Policy Initiative’s strategies, recommendations, or opinions.











  [10] https://www.integrity-research.com/wp-content/uploads/2018/01/Mitigating-Legal-Risks-Alternative-Data-January-2018-2.pdf, page 10.






