Specific Aims

Recent technological advances, from self-tracking applications and devices to genetic sequencing, have produced a torrent of structured quantitative data on nearly every aspect of human existence, including diet, physical activity, sleep, social interaction, environmental factors, symptom severity, and vital signs. This data holds tremendous potential for deriving personalized effectiveness values for treatments (similar to recommended daily values) and for revealing root causes of chronic conditions, particularly mental illnesses such as depression.

Because no open-source, universal, large-scale platform existed that could aggregate this disparate data and derive new scientific insights from it, Crowdsourcing Cures has developed a web framework and mobile applications for collecting, integrating, analyzing, and visualizing quantitative data from a wide array of sources.

We propose to further enhance the platform’s data collection, analysis, and sharing capabilities by carrying out a three-phase advanced development plan. Successful completion of this plan is expected to substantially accelerate the pace of clinical research and discovery. Building on decades of experience in data science, years of development, significant personal investment, and our established collaborations with leaders in biomedical computing, informatics, and big data science, we will:

1. LEARN – Collect and Aggregate Data

Acquire, integrate, and normalize heterogeneous data from devices and applications


To acquire, extract, transform, and normalize the countless unstandardized data export file formats and data structures and load them into a standardized structure that can be easily analyzed in order to derive clinical insight.


  • an application programming interface (API) for receiving and sharing data
  • a spreadsheet upload/import module
  • a connector framework to pull data from existing third-party APIs
  • software development kits (SDKs)


The API connector framework will allow the ongoing, regular import of user data after a single user authorization. SDKs will enable developers to implement easy automatic sharing options in their applications. A larger quantity of data is expected to yield a correspondingly larger number of clinical discoveries.
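As a rough illustration of the normalization step, the sketch below maps records from two differently shaped exports into one common measurement structure. It is not the platform's actual schema: the Measurement fields, the source names (sleep_app_csv, step_tracker_json), the field mappings, and the choice to treat timestamps as UTC are all hypothetical assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Measurement:
    """Hypothetical standardized record; field names are illustrative only."""
    variable: str
    value: float
    unit: str
    start_time: datetime
    source: str


# Conversion factors between the input unit and the standardized unit.
UNIT_FACTORS = {("h", "min"): 60.0, ("count", "count"): 1.0}

# Assumed per-source mappings: which keys hold the value and the timestamp,
# and which units the source uses. Real exports would need one entry each.
SOURCE_MAPS = {
    "sleep_app_csv": {
        "variable": "Sleep Duration", "value_key": "hours_slept",
        "time_key": "night_of", "unit_in": "h", "unit_out": "min",
    },
    "step_tracker_json": {
        "variable": "Daily Step Count", "value_key": "steps",
        "time_key": "day", "unit_in": "count", "unit_out": "count",
    },
}


def normalize(source: str, record: dict) -> Measurement:
    """Convert one raw record from a known source into a Measurement."""
    spec = SOURCE_MAPS[source]
    factor = UNIT_FACTORS[(spec["unit_in"], spec["unit_out"])]
    # Assumption: source timestamps are naive ISO 8601 strings, treated as UTC.
    start = datetime.fromisoformat(record[spec["time_key"]]).replace(tzinfo=timezone.utc)
    return Measurement(
        variable=spec["variable"],
        value=float(record[spec["value_key"]]) * factor,
        unit=spec["unit_out"],
        start_time=start,
        source=source,
    )


sleep = normalize("sleep_app_csv", {"night_of": "2024-01-15", "hours_slept": "7.5"})
steps = normalize("step_tracker_json", {"day": "2024-01-15", "steps": 8200})
```

Keeping the per-source knowledge in a mapping table rather than in code means that, under these assumptions, supporting a new export format would mostly involve adding one table entry.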

2. THINK – Analyze Data

Calculate the personalized correlation between every quantifiable factor and symptom severity and determine optimal daily values of these factors for each user


To quantify the effectiveness of treatments for specific individuals, reveal hidden factors exacerbating their illness, and determine personalized optimal daily values for these factors.


We will develop time-series data mining algorithms to quantify correlations between every combination of variables for a given subject. We will also design algorithms capable of determining the minimum quantities of nutrient intake, sleep, exercise, medications, and other factors necessary to minimize symptom severity.
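One simple form such a correlation analysis could take is a lagged Pearson correlation between a daily factor and a daily symptom rating for a single subject. The sketch below is an assumption-laden example rather than the proposed algorithm: the function names, the use of Pearson's coefficient, and the daily sampling are illustrative choices, and a strong correlation at some lag does not by itself establish causation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def lagged_correlation(factor, symptom, lag_days):
    """Correlate the factor on day t with the symptom on day t + lag_days."""
    if lag_days < 0:
        raise ValueError("lag_days must be non-negative")
    if lag_days == 0:
        return pearson(factor, symptom)
    return pearson(factor[:-lag_days], symptom[lag_days:])


def strongest_lag(factor, symptom, max_lag_days):
    """Return the lag (in days) with the largest absolute correlation."""
    lags = range(max_lag_days + 1)
    return max(lags, key=lambda lag: abs(lagged_correlation(factor, symptom, lag)))


# Toy data (fabricated for illustration): the symptom follows the factor by one day.
factor = [1, 3, 2, 5, 4, 6, 2, 7]
symptom = [0, 1, 3, 2, 5, 4, 6, 2]
lag = strongest_lag(factor, symptom, max_lag_days=2)
```

Scanning several lags matters because many factors, such as diet or sleep, may plausibly affect symptoms a day or more later rather than on the same day.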


These analyses will help reduce the incidence of chronic illnesses by informing users of symptom triggers to avoid, such as dietary sensitivities. They will also help patients and clinicians assess the effectiveness of treatments despite the hundreds of uncontrollable variables in any prescriptive experiment.

3. DO – Publish Discoveries

Establish a research commons to anonymously pool data in stratified user groups and share discoveries


To allow users to publish their findings and to reduce error in the predictive analysis by increasing sample size through the pooling of data from relatively homogeneous groups of users.


We will expand the Journal of Citizen Science content management system and the Crowdsourcing Cures API to serve as a platform where anyone can share, access, and analyze anonymous data and publish studies. We will also enable the grouping of data among relatively homogeneous groups of users stratified by their environmental, microbiomic, demographic, genomic, and/or disease profiles.
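The stratified pooling described above might look something like the following sketch, which groups per-user results by shared profile attributes and averages them within each group. The record layout, the profile keys, and especially the min_group_size threshold (used here as a crude guard against exposing very small groups) are hypothetical assumptions for illustration, not features the platform is known to have.

```python
from collections import defaultdict
from statistics import mean


def pool_by_stratum(user_results, keys, min_group_size=3):
    """Group per-user correlations by profile attributes and average them.

    user_results: list of dicts with a "profile" dict and a "correlation" float
                  (an assumed layout; no user identifiers are carried along).
    keys:         profile attributes that define a stratum.
    min_group_size: assumed threshold; smaller groups are withheld.
    """
    groups = defaultdict(list)
    for result in user_results:
        stratum = tuple(result["profile"][k] for k in keys)
        groups[stratum].append(result["correlation"])
    return {
        stratum: {"n_users": len(values), "mean_correlation": mean(values)}
        for stratum, values in groups.items()
        if len(values) >= min_group_size
    }


# Fabricated example records for illustration only.
results = [
    {"profile": {"age_band": "30-39", "condition": "depression"}, "correlation": 0.5},
    {"profile": {"age_band": "30-39", "condition": "depression"}, "correlation": 0.25},
    {"profile": {"age_band": "30-39", "condition": "depression"}, "correlation": 0.75},
    {"profile": {"age_band": "60-69", "condition": "depression"}, "correlation": -0.5},
]
pooled = pool_by_stratum(results, keys=("age_band", "condition"))
```

In this toy data the single user in the 60-69 band is withheld from the output because the group falls below the assumed minimum size.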


Clinicians and those suffering from chronic conditions will have access to the personalized effectiveness rates of treatments and the percent likelihood of root causes.

This work will usher in an era of personalized preventative medicine through crowdsourced clinical research by providing:

  • a new secure platform capable of aggregating massive amounts of heterogeneous life-tracking data
  • a tool to help clinicians and those suffering from chronic conditions determine personalized effectiveness rates of treatments and the percent likelihood of root causes
  • the ability to run and publish large-scale observational research studies in a matter of minutes on stratified user groups