Illumina 1G Output Synopsis

Processing Illumina output requires large filesystems and sophisticated processing capability. Inexperienced Illumina users are strongly advised to collaborate with a lab experienced in processing Illumina data. Labs affiliated with the CGRB have the additional advantage of high-throughput connectivity, which allows large data sets to be off-loaded quickly.

An Illumina run creates raw output files that are then processed by the Genome Analyzer (GA) Pipeline. With GA Pipeline v1.0, Illumina runs have the following space requirements:

Run Type     Cycles  Tiles/Lane  Raw Size  Analysis Size  Total Size
-----------  ------  ----------  --------  -------------  -------------
single-read  36      330         680GB     100GB - 400GB  780GB - 1.1TB
paired-end   36      330         1.4TB     400GB - 700GB  1.8TB - 2.1TB
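
As a rough planning aid, the totals in the table can be reproduced by adding the raw size to the low and high ends of the analysis size. The sketch below is only an illustration: the dictionary layout and the function names `total_size_range_gb` and `fits_on_disk` are hypothetical, and it assumes 1TB = 1000GB.

```python
# Sizes in GB, copied from the table above (assumption: 1TB = 1000GB).
RUN_SIZES_GB = {
    "single-read": {"raw": 680, "analysis": (100, 400)},
    "paired-end": {"raw": 1400, "analysis": (400, 700)},
}


def total_size_range_gb(run_type):
    """Return the (low, high) total size in GB for a run type."""
    sizes = RUN_SIZES_GB[run_type]
    low, high = sizes["analysis"]
    return sizes["raw"] + low, sizes["raw"] + high


def fits_on_disk(run_type, free_gb):
    """Return True if the worst-case total for the run fits in free_gb."""
    return total_size_range_gb(run_type)[1] <= free_gb


if __name__ == "__main__":
    for run_type in RUN_SIZES_GB:
        low, high = total_size_range_gb(run_type)
        print(f"{run_type}: {low}GB - {high}GB")
```

Checking against the worst case (the high end of the range) is the conservative choice, since the analysis size of a given run is not known in advance.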
           

Once the raw data have been collected and run through the pipeline, we will send an email to the users affiliated with the run. This email will include:

  • the run ID of the completed run
  • an ELAND analysis summary table
  • a hyperlink to a run summary

The data will remain on the CGRB file server for about two weeks from the time the email is sent. Depending on the scheduling of subsequent runs, the data may remain longer, but there are no guarantees.

Prior to deletion, all runs are backed up to tape. If a single-read run does not fit on one tape, or a paired-end run does not fit on two tapes, we will delete Bustard and GERALD data as needed to fit all the raw data on the tape. If necessary, these data can be regenerated from the raw data by re-running the GA Pipeline.
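
The tape rule above can be sketched as a small calculation. This is an illustration only, not the CGRB's actual backup procedure: the function name `analysis_gb_to_drop` is hypothetical, and the tape capacity is left as a parameter because it is not stated here.

```python
# Tapes allowed per run type, as described in the text above.
TAPES_ALLOWED = {"single-read": 1, "paired-end": 2}


def analysis_gb_to_drop(run_type, raw_gb, analysis_gb, tape_capacity_gb):
    """Return how many GB of Bustard/GERALD output must be dropped
    so that the run fits within its tape allowance.

    tape_capacity_gb is an assumed input, not a documented value.
    """
    budget_gb = TAPES_ALLOWED[run_type] * tape_capacity_gb
    if raw_gb > budget_gb:
        # The raw data are always kept, so this case cannot be solved
        # by dropping analysis output.
        raise ValueError("raw data alone exceed the tape budget")
    overflow_gb = raw_gb + analysis_gb - budget_gb
    # Drop nothing if everything fits; never drop more than exists.
    return max(0, min(analysis_gb, overflow_gb))
```

For example, with a hypothetical 800GB tape, a single-read run with 680GB of raw data and 400GB of analysis output would need 280GB of analysis output dropped.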

We also copy the run summary, ELAND result summary, and IVC plots to the CGRB Core Labs ordering site. This information is viewable by logging in and clicking the "View Data" link in the left-side menu.