Ryan Hsu

Mar 10, 2017

Data!

Hi backers!

We finally received the DNA sequencing data from uBiome. Altogether it's about 12 GB of data in FASTQ format. FASTQ files are basically fancy text files containing strings of A, T, C, and G (the 4 bases of DNA), quality/confidence scores, and other metadata.
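For the curious, here is a minimal sketch of what reading a FASTQ record can look like in Python. It is an illustration only, not code from our actual pipeline. It assumes the common four-line record layout and Phred+33 quality encoding, and the function name `parse_fastq` and the example record are made up.

```python
from typing import Iterator, List, Tuple


def parse_fastq(lines: List[str]) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, quality_scores) from four-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = (line.strip() for line in lines[i:i + 4])
        # Assumption: Phred+33 encoding, so score = ASCII code minus 33.
        scores = [ord(ch) - 33 for ch in qual]
        # Drop the leading "@" from the header to get the read ID.
        yield header[1:], seq, scores


# Made-up example record (not real study data).
example = ["@read_1", "ATCG", "+", "II5#"]
records = list(parse_fastq(example))
print(records)  # [('read_1', 'ATCG', [40, 40, 20, 2])]
```

A higher score means the sequencer was more confident in that base, which is what the quality filtering step relies on.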

To analyze the data, I'm currently building a computational analysis pipeline in Python, which processes the data and calculates metrics.

The main steps in processing the data are:

  1. Merge the reads. It's a bit like taking two photos side by side, then combining the photos where they overlap. In the same way, we stitch together a longer DNA sequence from two shorter ones.

  2. Trimming and quality filtering. Artifacts from DNA amplification and sequencing are present in sequencing data, so we discard reads that do not meet our quality thresholds.

  3. OTU (Operational Taxonomic Unit) clustering. This groups similar sequences in each sample into what are called OTUs, which serve as rough stand-ins for species. From here, we can look up each OTU against large 16S databases to figure out what bacteria each DNA read came from.

  4. Calculate diversity metrics and identify significant changes between the Soylent and regular diet groups.
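To give a feel for steps 1, 2, and 4, here is a heavily simplified sketch in Python. It is not our actual pipeline: real tools such as UPARSE, QIIME, and mothur tolerate mismatches, use quality scores when merging, and do far more. The function names, the exact-match overlap rule, the thresholds (`min_overlap`, `min_mean`), and the choice of the Shannon index as the diversity metric are all assumptions made for illustration, and the example reads and counts are made up.

```python
import math
from typing import Dict, List, Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward: str, reverse: str, min_overlap: int = 4) -> Optional[str]:
    """Step 1 (simplified): stitch two reads together where they overlap exactly."""
    rc = reverse_complement(reverse)
    # Try the longest possible overlap first, then shrink.
    for size in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        if forward[-size:] == rc[:size]:
            return forward + rc[size:]
    return None  # no acceptable overlap found


def passes_quality(scores: List[int], min_mean: float = 20.0) -> bool:
    """Step 2 (simplified): keep a read only if its mean quality is high enough."""
    return bool(scores) and sum(scores) / len(scores) >= min_mean


def shannon_index(otu_counts: Dict[str, int]) -> float:
    """Step 4 (one possible metric): Shannon diversity from per-OTU read counts."""
    total = sum(otu_counts.values())
    return -sum(
        (n / total) * math.log(n / total) for n in otu_counts.values() if n > 0
    )


# Made-up reads: both come from the fragment ATGCCGTAGGCT.
merged = merge_pair("ATGCCGTA", "AGCCTACG")
print(merged)  # ATGCCGTAGGCT

# Made-up counts: two equally abundant OTUs.
print(shannon_index({"OTU_1": 50, "OTU_2": 50}))
```

A sample dominated by a single OTU scores 0 on this index, while a sample spread evenly across many OTUs scores higher, so comparing the index across time points is one way to see whether diversity shifted.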

Here's a screenshot showing some of the process. On the top left we have an open FASTQ file with sequences of DNA. As you can tell, it's not really meant for humans to read. On the bottom left we have part of the analysis pipeline running. On the right side, there's some of the code I've written to manipulate the data. Some of the bioinformatic tools we are going to use include UPARSE, QIIME, and mothur.

Once our initial analysis is complete, we will meet with faculty at UC Berkeley to get more perspectives on the data. We also have a few members of the Arkin Lab who are guiding us in formally writing up our findings.

We'll keep you up to date with what we find!

5 comments

  • Asa Kaplan
    Hey, is there any chance you'd think about posting the anonymized raw data in the mean time? I'd be interested in taking a crack at coding preliminary results.
    Sep 13, 2017
  • Nader Rasoul
    Hope you'll be posting the results soon! As Denny Luan suggested, it would also be nice to see your code as well.
    Sep 01, 2017
  • James Cho
    Very interested, as a long time Soylent drinker. I've been drinking lots of kombucha and eating yogurt to compensate but have no idea if it actually helps, or if it's needed.
    Jul 23, 2017
  • Henry Reed
    Can't wait to see these results!!!
    Jul 23, 2017
  • Denny Luan (Backer)
    You should post your python code! I'm curious to see it :)
    Mar 10, 2017

About This Project

As students carry out their busy lifestyles, many are turning to inexpensive and convenient drink-based meal alternatives, such as Soylent, to supplement or replace their regular diets. These meal alternatives are designed to fulfill the nutritional needs of a human, but their impact on the microbiome remains unknown. This project aims to track the composition of participants' microbiomes before, during, and after Soylent use to more holistically understand Soylent's impact on microbiome health.
