Mammalian gene collection

The Mammalian Gene Collection (MGC) project is an NIH initiative. Its goal is to make available a complete set of full-length cDNA clones for human, rat, mouse and cow.

The MGC plan of work is:

  • To produce libraries enriched for full-length cDNA;

  • To sequence the 5' and 3' ends of 768 clones in each library for library validation;

  • To array and sequence from each validated library the 5' ends of 10,000 to 30,000 clones. Each of these clones is given an I.M.A.G.E. ID and becomes part of the main I.M.A.G.E. resource (LLAM and LLCM clones);

  • To apply several algorithms to analyze the 5' sequences to identify clones containing putative N-terminal coding regions. The clones identified are rearrayed into 384-well plates by the I.M.A.G.E. Consortium at LLNL. These plates are labelled with the prefix IRAK, IRAL or IRAM on the I.M.A.G.E. web site and each clone retains its unique IMAGE ID;

  • To assign a unique MGC identifier (e.g. MGC_15287 or its synonym MGC:15287) only to those clones confirmed by sequencing to harbour full-length ORFs. This identifier is given in addition to the original IMAGE ID, and the sequences are deposited in GenBank;

  • To rearray clones with a unique MGC_ID into 96-well plates.

In summary, a Mammalian Gene Collection (MGC) clone is identified by:

  • A unique IMAGE identifier [IMAGE ID] (cross-referring to one or more EST files in dbEST)

  • A unique MGC identifier [MGC_ID] (cross-referring to a sequence file in GenBank of the entire clone insert)
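
Because the MGC identifier appears in two spellings (MGC_15287 and its synonym MGC:15287), it can help to reduce both to one form before cross-referencing records. The following is a minimal Python sketch under that assumption only; the function name normalize_mgc_id and the choice of the underscore as the default form are illustrative and are not part of any MGC or I.M.A.G.E. tool.

```python
import re

# Matches the two spellings given in the text: "MGC_15287" and "MGC:15287".
_MGC_PATTERN = re.compile(r"MGC[_:](\d+)")


def normalize_mgc_id(raw: str, separator: str = "_") -> str:
    """Return an MGC identifier rewritten with a single separator.

    `normalize_mgc_id` is a hypothetical helper for illustration only.
    """
    match = _MGC_PATTERN.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not an MGC identifier: {raw!r}")
    return f"MGC{separator}{match.group(1)}"


if __name__ == "__main__":
    print(normalize_mgc_id("MGC:15287"))
    print(normalize_mgc_id("MGC_15287", separator=":"))
```

Note that the sketch deliberately rejects IMAGE IDs: the two identifiers point to different records (EST files in dbEST versus the full insert sequence in GenBank) and should not be mixed.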

Description of rearrayed plates containing sequenced full-length MGC clones:


IRAT   Human   Vectors confer ampicillin resistance
IRAU   Human   Vectors confer chloramphenicol resistance
IRCM   Human   Vectors confer kanamycin resistance
IRAV   Mouse   Vectors confer ampicillin resistance
IRAW   Mouse   Vectors confer chloramphenicol resistance
IRCL   Mouse   Vectors confer kanamycin resistance
IRBP   Rat     Vectors confer ampicillin resistance
IRBQ   Rat     Vectors confer chloramphenicol resistance
IRCJ   Cow     Vectors confer ampicillin resistance


NB: In each of these subsets (IRAT, IRAU, IRCM, IRAV, IRAW, IRCL, IRBP, IRBQ and IRCJ), all clones on one plate share the same vector. Details of the vectors used can be found here
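
The plate prefixes listed above can be held in a small lookup table, for example to choose the selection antibiotic when a clone arrives. The Python sketch below is illustrative only: the names PlateSubset, MGC_SUBSETS and describe_plate are hypothetical, and it assumes that a plate label begins with its four-letter prefix.

```python
from typing import NamedTuple


class PlateSubset(NamedTuple):
    species: str
    resistance: str  # antibiotic resistance conferred by the vectors


# Transcribed from the list of rearrayed MGC plate subsets above.
MGC_SUBSETS = {
    "IRAT": PlateSubset("human", "ampicillin"),
    "IRAU": PlateSubset("human", "chloramphenicol"),
    "IRCM": PlateSubset("human", "kanamycin"),
    "IRAV": PlateSubset("mouse", "ampicillin"),
    "IRAW": PlateSubset("mouse", "chloramphenicol"),
    "IRCL": PlateSubset("mouse", "kanamycin"),
    "IRBP": PlateSubset("rat", "ampicillin"),
    "IRBQ": PlateSubset("rat", "chloramphenicol"),
    "IRCJ": PlateSubset("cow", "ampicillin"),
}


def describe_plate(label: str) -> PlateSubset:
    """Look up species and resistance from a plate label.

    Assumption: the label starts with the four-letter subset prefix.
    """
    prefix = label.strip().upper()[:4]
    try:
        return MGC_SUBSETS[prefix]
    except KeyError:
        raise ValueError(f"{prefix!r} is not a sequenced MGC subset prefix") from None


if __name__ == "__main__":
    subset = describe_plate("IRCL")
    print(subset.species, subset.resistance)
```

Prefixes such as IRAK, IRAL and IRAM are intentionally absent from the table, since those plates hold candidate clones rather than sequenced full-length MGC clones.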

Clones from these subsets can be ordered as individual clones (streaks on agar) or as 96-well plates (frozen bacterial stocks in glycerol).

Clones are ordered using the IMAGE ID on the standard I.M.A.G.E. order form (you cannot use the MGC_ID). If the clone ordered forms part of an MGC subset in the Source BioScience collection, you will receive the clone streaked out from an IRAT, IRAU, IRCM, IRAV, IRAW, IRCL, IRBP, IRBQ or IRCJ plate. Such clones also have an MGC_ID, which will be given in the email confirmation of your order. If a clone has an MGC_ID but is not yet part of the subset plates received by Source BioScience LifeSciences, the clone will be streaked out from the main I.M.A.G.E. collection.

For a list of the current plates available in each MGC subset please click here


Strausberg RL, Feingold EA, Klausner RD & Collins FS (1999). The Mammalian Gene Collection. Science 286: 455-457

MGC (Mammalian Gene Collection) Program Team (2002). Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. PNAS 99(26): 16899-16903