The Benefits of Crowdsourcing UFO Data

Aug 2, 2023

By Laura, Senior Data Scientist at Enigma Labs. Laura has a PhD in computational chemistry and previously worked on solar technology.

Military sightings — a classification challenge

When the U.S. Navy declassified three well-known UAP videos, "FLIR," "GOFAST" and "GIMBAL," in April 2020, there was much public excitement and scrutiny. Many people would like to see more of these military sightings to judge with their own eyes what UAP might be. Senators like Chuck Schumer are pushing for legislation to declassify more military cases.

Two months ago, during NASA's independent panel discussion on UAP, Dr. Nicola Fox, Associate Administrator for NASA's Science Mission Directorate, pointed out the challenge around declassifying military sighting cases. “Unidentified Anomalous Phenomena sightings themselves are not classified,” she said. “It's often the sensor platform that is classified.”

What Dr. Fox meant is that if a FLIR image is taken from an F-16, the image is not typically classified because of its content. It's classified because the Air Force or Navy does not want to disclose to other countries which radar or sensors on board the aircraft detected the object. As a result of this classification rubric, the public has not seen most of the UAP images and sensor captures purportedly taken by military pilots, regardless of whether they show anomalous or known objects.

Military sightings — a quantity challenge

Beyond the classification challenges, another issue is that the total number of UAP officially reported by the military is still relatively small. The 2021 DNI Preliminary Assessment tallied all military sightings from November 2004 to March 2021 — for a total of just 144 reports. One year later, the 2022 DNI report stated that, “there have been 247 new reports and another 119 that were either since discovered or reported after the preliminary assessment’s time period. This totals 510 UAP reports as of 30 August 2022.” By the Senate Armed Services Hearings in April, that number had grown to 650. While the number is growing and reporting efforts are ongoing, 650 is still a small number from which to conduct meaningful analysis.

There are likely plenty of reasons why this number could remain small. Military pilots still face friction, uncertainty and stigma when logging any anomalous objects they see. In addition, U.S. military bases and aircraft carriers cover only a tiny percentage of U.S. landmass and coastlines.

The Rise of Citizen Science

The good news is that today we have a much larger data source to work with — unclassified civilian reports. Theoretical astrophysicist David Spergel, an emeritus professor at Princeton University and President of the Simons Foundation, had this to say about civilian UAP sightings: “This is an opportunity for citizen science… smartphones are fabulous data collectors, and there's three to four billion of them on the planet.”

Although the imagery and sensors on board an F/A-18 fighter jet are, unsurprisingly, much more sophisticated than the camera and sensors on your smartphone, it's worth running the math on the potential for civilian reports. If only a small fraction of the 5 billion smartphone users on Earth submitted a possible UAP sighting, and only a fraction of those sightings were of decent quality, we would still have hundreds of thousands of useful cases to analyze. The sheer volume of data provided by the public can allow for a much broader and more comprehensive study of UAP than classified datasets ever could, and could also complement the smaller, classified datasets.
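To make that math concrete, here is a rough sketch in Python. Only the 5 billion figure comes from the text above. The submission rate and the quality rate are purely illustrative assumptions, chosen to show how two small fractions can still leave a large number of cases.

```python
# Back-of-the-envelope estimate of usable civilian UAP reports.
# Only SMARTPHONE_USERS comes from the article; both rates below are
# illustrative assumptions, not measured values.

SMARTPHONE_USERS = 5_000_000_000


def usable_reports(users: int, submit_rate: float, quality_rate: float) -> int:
    """Users who submit a sighting, times the share of decent-quality submissions."""
    return round(users * submit_rate * quality_rate)


# Hypothetical rates: 1 in 1,000 users ever submits a sighting,
# and 1 in 20 of those submissions is of decent quality.
estimate = usable_reports(SMARTPHONE_USERS, submit_rate=0.001, quality_rate=0.05)
print(f"{estimate:,} usable reports")
```

Under these assumed rates the estimate lands in the hundreds of thousands. Different assumptions would of course move the number up or down.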

How Crowdsourcing Drives Scientific Collaboration

Beyond quantity, another huge benefit to unclassified data is that the scientific community can openly communicate around it. Open collaboration is easier than working behind closed doors, and data transparency accelerates the chances of a breakthrough.

Crowdsourcing can also harness fresh and alternative expertise to solve problems. There is strong precedent for how crowdsourcing has successfully contributed to scientific and humanitarian progress. One example is NASA's Curiosity and Perseverance rovers, which carry a rich array of instruments and send information from Mars to Earth. That data needs to be rapidly analyzed so that instructions can be sent back to the rovers. Those instructions enable the rovers to prioritize what operations to perform next. So NASA launched the DrivenData challenge, tapping science groups who could help turbocharge the analysis of planetary data in real time. Recognizing the breadth of potential benefits, in 2011 NASA established the Center of Excellence for Collaborative Innovation (CoECI) to leverage crowdsourcing for research and development.

[Citizen scientists competed to create a model that automatically analyzes mass spectrometry data collected for Mars exploration]

Harnessing Crowdsourcing to Power Data Labeling and AI

Crowdsourcing can also help solve a perennial problem in data analysis: the shortage of labeled data. In a 2014 lunar crater study by CU Boulder, thousands of volunteers helped count and identify craters. By harnessing the public, the study was able to reduce measurement error with statistical methods that are not possible with only a small team. And it turned out that the group of thousands of volunteers was able to count and identify craters just as accurately as the group of eight professionals who had many more years of experience with lunar craters.
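One way to see why a large group helps is a small simulation. The sketch below is not the method used in the crater study. It simply assumes that every labeler makes independent, unbiased counting errors of the same size, and compares the error of the averaged count for 8 labelers versus 1,000. The crater count and noise level are hypothetical.

```python
import random
import statistics


def rms_error_of_mean(n_labelers, true_count, noise_sd, trials, rng):
    """RMS error of the averaged crater count across repeated simulated trials."""
    squared_errors = []
    for _ in range(trials):
        # Each labeler reports the true count plus independent Gaussian noise.
        counts = [rng.gauss(true_count, noise_sd) for _ in range(n_labelers)]
        squared_errors.append((statistics.fmean(counts) - true_count) ** 2)
    return statistics.fmean(squared_errors) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_COUNT = 100        # hypothetical number of craters in one image
NOISE_SD = 15           # hypothetical per-person counting noise

small_team = rms_error_of_mean(8, TRUE_COUNT, NOISE_SD, trials=200, rng=rng)
crowd = rms_error_of_mean(1000, TRUE_COUNT, NOISE_SD, trials=200, rng=rng)
print(f"8 labelers: {small_team:.2f}   1000 labelers: {crowd:.2f}")
```

Under these assumptions the error of the average shrinks roughly with the square root of the number of labelers, which is why a crowd can match or beat a small team even when each individual is no more accurate.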

Crater labeling on the moon performed by experts versus crowdsourced volunteers

In the case of UAP sightings, there's huge and untapped potential in manual labeling, which we have begun on the team at Enigma. Just as with the moon crater study, both experts and crowds can be used to build high-quality labeled datasets (text and media) of identifiable objects and their movement patterns, including satellites, aircraft, drones and celestial objects. A machine learning model can train on those labeled datasets and, when shown objects in new data, judge whether they are identifiable or unidentifiable. Over time, machine learning might even be able to spot patterns overlooked by human eyes and unlock insights that were previously elusive.
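As a toy illustration of that idea, and not a description of Enigma's actual models, the sketch below trains a nearest-centroid classifier on a handful of labeled examples and returns "unidentified" when a new object is far from every known class. The two features, the example values and the distance threshold are all hypothetical.

```python
import math

# Hypothetical labeled examples: (angular speed in deg/s, brightness 0-1).
# Real features would need scaling; these toy values are used as-is.
LABELED = {
    "satellite": [(0.5, 0.3), (0.6, 0.4), (0.4, 0.35)],
    "aircraft": [(2.0, 0.7), (2.5, 0.8), (1.8, 0.75)],
    "drone": [(5.0, 0.2), (6.0, 0.25), (5.5, 0.15)],
}


def centroid(points):
    """Mean of each feature across a class's labeled examples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


# "Training" here is just computing one centroid per labeled class.
CENTROIDS = {label: centroid(pts) for label, pts in LABELED.items()}


def classify(features, max_distance=1.0):
    """Return the nearest known class, or 'unidentified' if nothing is close."""
    label, dist = min(
        ((lbl, math.dist(features, c)) for lbl, c in CENTROIDS.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= max_distance else "unidentified"


print(classify((2.2, 0.72)))  # close to the aircraft examples
print(classify((20.0, 0.9)))  # far from every known class
```

The design choice worth noting is the rejection threshold: a model that must always pick a known class can never flag something as unidentifiable, so the "none of the above" outcome has to be built in explicitly.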

This becomes increasingly important in a real-time environment. Today, U.S. airspace is already cluttered. There is too much noise. With the rise of drones and advanced aircraft, the flood of airspace data will only get worse. Machine learning will enable us to extract details faster and at scale.

Improving Crowdsourced Data by Standardizing Reporting

Labeling data is one way to drive progress. Another critical step to benefit from machine learning is moving from unstructured to structured data. AI algorithms base their output on the data that they ingest. If the input can be improved, the output can be improved. To date, the limited number of structured UAP reports has hampered research.

That is why at Enigma, we focus on standardization of reporting. We screen incoming data and ask the right questions about the nature of UAP sightings in order to minimize the need for inference. Using AI, we have also developed linguistic categories that describe sightings more accurately. These richer and cleaner datasets enhance the capability to identify what is in our skies.
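To show what moving from free text to structured data can look like, here is a minimal sketch of a validated sighting record. It is not Enigma's schema. The field names, the shape categories and the validation rules are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Shape(Enum):
    """Hypothetical fixed vocabulary replacing free-text shape descriptions."""
    ORB = "orb"
    TRIANGLE = "triangle"
    DISK = "disk"
    OTHER = "other"


@dataclass(frozen=True)
class SightingReport:
    observed_at: datetime
    latitude: float
    longitude: float
    duration_seconds: float
    shape: Shape

    def __post_init__(self):
        # Reject records that would otherwise force guesswork later.
        if self.observed_at.tzinfo is None:
            raise ValueError("observed_at must include a UTC offset")
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude out of range")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude out of range")
        if self.duration_seconds <= 0:
            raise ValueError("duration must be positive")


def parse_report(raw: dict) -> SightingReport:
    """Convert loosely typed form input into a validated record."""
    return SightingReport(
        observed_at=datetime.fromisoformat(raw["observed_at"]),
        latitude=float(raw["latitude"]),
        longitude=float(raw["longitude"]),
        duration_seconds=float(raw["duration_seconds"]),
        shape=Shape(raw["shape"].strip().lower()),
    )


report = parse_report({
    "observed_at": "2023-08-02T21:15:00+00:00",
    "latitude": "40.35",
    "longitude": "-74.66",
    "duration_seconds": "45",
    "shape": " Triangle ",
})
print(report.shape.value, report.duration_seconds)
```

Asking for a timestamp with an offset, a bounded location and a fixed shape vocabulary up front means less has to be inferred from free text afterwards.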

Over time, we hope that researchers will be able to both contribute to and leverage Enigma's standardized and enriched sighting datasets to perform analyses that were not previously possible. With the contribution of each new sighting, data point and insight, we inch closer to uncovering the truths out there.
