Mausoleum
Genes nomenclature
Excerpt from Guidelines for Nomenclature of Genes, Genetic Markers, Alleles, and Mutations in Mouse and Rat
Targeted and Trapped Mutations
Knockout, Knockin, Conditional and Other Targeted Mutations
Mutations that are the result of gene targeting by homologous recombination in ES cells are given the symbol of the targeted gene, with a superscript consisting of three parts: the symbol tm to denote a targeted mutation, a serial number from the laboratory of origin and the Laboratory code where the mutation was produced.
- For example, Cftrtm1Unc is the first targeted mutation of the cystic fibrosis transmembrane regulator (Cftr) gene produced at the University of North Carolina.
So-called "knock in" mutations, in which all or part of the coding region of one gene is replaced by another, should be given a tm symbol and the particular details of the knock-in associated with the name in publications or databases. Where there has been a replacement of the complete coding region, the replacing gene symbol can be used parenthetically as part of the allele symbol of the replaced gene along with a Laboratory code and serial number.
- For example, En1tm1(Otx2)Wrst where the coding region of En1 was replaced by the Otx2 gene, originating from the W. Wurst laboratory.
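The three-part superscript described above (tm, serial number, Laboratory code, with an optional replacing gene symbol in parentheses) can be split mechanically. The following is a minimal sketch, not part of the Guidelines: the function name `parse_tm_superscript` is hypothetical, and the pattern assumes a Laboratory code written as one capital letter followed by lowercase letters.

```python
import re
from typing import NamedTuple, Optional


class TargetedSuperscript(NamedTuple):
    serial: int                 # serial number from the laboratory of origin
    replacement: Optional[str]  # replacing gene symbol in parentheses, if any
    labcode: str                # Laboratory code


# Assumed shape: "tm", digits, optional "(symbol)", then a Laboratory code
# consisting of one capital letter followed by lowercase letters.
_TM_PATTERN = re.compile(r"tm(\d+)(?:\(([^()]+)\))?([A-Z][a-z]*)")


def parse_tm_superscript(superscript: str) -> TargetedSuperscript:
    """Split a simple tm superscript such as 'tm1Unc' into its parts."""
    match = _TM_PATTERN.fullmatch(superscript)
    if match is None:
        raise ValueError(f"not a simple tm superscript: {superscript!r}")
    serial, replacement, labcode = match.groups()
    return TargetedSuperscript(int(serial), replacement, labcode)


print(parse_tm_superscript("tm1Unc"))
print(parse_tm_superscript("tm1(Otx2)Wrst"))
```

The sketch takes only the superscript portion as input, because the gene symbol and the superscript cannot be separated reliably once the superscript formatting is lost.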
Knock in alleles expressing an RNAi under the control of the endogenous promoter can be designated using targeted mutation or transgene mutation nomenclature, as appropriate:
Example:
Genetm#(RNAi:Xyz)Labcode
When a targeting vector is used to generate multiple germline transmissible alleles, such as in the Cre-Lox system, the original knock-in of loxP would follow the regular tm designation rules. If a second heritable allele was then generated after mating with a cre transgenic mouse, it would retain the parental designation followed by a decimal point and serial number.
- Tfamtm1Lrsn and Tfamtm1.1Lrsn. In this example, Tfamtm1Lrsn designates a targeted mutation where loxP was inserted into the Tfam gene. Tfamtm1.1Lrsn designates another germline transmissible allele generated after mating with a cre transgenic mouse. Note: somatic events generated in offspring from a Tfamtm1Lrsn bearing mouse and a cre transgenic that cause disruption of Tfam in selective tissues would not be assigned nomenclature.
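As an illustration of the decimal rule above, a derived allele designation can be formed from the parental superscript. This is a sketch under stated assumptions: `derived_allele` is a hypothetical helper, and the Laboratory code is assumed to be one capital letter followed by lowercase letters.

```python
import re


def derived_allele(parent_superscript: str, derivative_number: int) -> str:
    """Return the superscript of a heritable allele derived from a tm allele.

    The parental designation is kept and followed by a decimal point and a
    serial number, which is placed before the Laboratory code.
    """
    # Assumed shape of the parent: "tm", digits, Laboratory code.
    match = re.fullmatch(r"(tm\d+)([A-Z][a-z]*)", parent_superscript)
    if match is None:
        raise ValueError(f"unexpected parent superscript: {parent_superscript!r}")
    if derivative_number < 1:
        raise ValueError("derivative serial numbers start at 1")
    base, labcode = match.groups()
    return f"{base}.{derivative_number}{labcode}"


print(derived_allele("tm1Lrsn", 1))  # tm1.1Lrsn
```

Only germline transmissible derivatives would be passed through such a helper; somatic events receive no nomenclature.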
Other more complex forms of gene replacement, such as partial "knock-in", hit-and-run, double replacements, and loxP mediated integrations are not conveniently abbreviated and should be given a conventional tm#Labcode superscript. Details of the targeted locus should be given in associated publications and database entries.
Note that subtle alterations made in a gene may appear to lend themselves to a simple naming convention in which the base or amino acid changes are specified. In fact, such names are not unique: alterations made in independent labs can bear the same specified changes yet differ elsewhere in the gene.
Large-scale projects that systematically produce a large number of alleles (>1000) may include a project abbreviation in parentheses as part of the allele designation. These should retain the accepted nomenclature features of other alleles of that class. For example, a targeted allele created by Velocigene (Regeneron) in the KOMP knockout project:
Gstm3tm1(KOMP)Vlcg
Once fully designated in a publication, the allele can be abbreviated by removing the portion of the allele designation in parentheses (in this case, Gstm3tm1Vlcg), providing the symbol remains unique.
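The abbreviation rule can be sketched as follows. The function name `abbreviate_allele` and the `symbols_in_use` set are hypothetical; how uniqueness is actually checked (for example, against a database) is not specified here.

```python
from typing import Set


def abbreviate_allele(full_symbol: str, project: str, symbols_in_use: Set[str]) -> str:
    """Drop the '(project)' portion of an allele symbol if the result stays unique."""
    tag = f"({project})"
    if tag not in full_symbol:
        raise ValueError(f"{full_symbol!r} does not contain {tag!r}")
    short = full_symbol.replace(tag, "", 1)
    # Fall back to the full designation when the short form is already taken.
    return full_symbol if short in symbols_in_use else short


print(abbreviate_allele("Gstm3tm1(KOMP)Vlcg", "KOMP", set()))  # Gstm3tm1Vlcg
```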
Endonuclease-induced Mutations
Endonuclease-induced mutations are targeted mutations generated in pluripotent or totipotent cells by an endonuclease joined to sequence-specific DNA binding domains. The mutation is introduced during homology-directed or non-homologous end-joining repair of the induced DNA break(s). Endonuclease-induced mutations are given the symbol of the mutated gene, with a superscript consisting of three parts: the symbol em to denote an endonuclease-induced mutation, a serial number from the laboratory of origin and the Laboratory code where the mutation was produced.
Example:
Fgf1em1Mcw is the first endonuclease-induced mutation of the fibroblast growth factor 1 (Fgf1) gene produced at the Medical College of Wisconsin.
Gene Trap Mutations
Gene trap mutations are symbolized in a similar way to targeted mutations. If the trapped gene is known, the symbol for the trapped allele will be similar to a targeted mutation of the same gene using the format Gt(vector content)#Labcode for the allele designation. Example:
Akap12Gt(ble-lacZ)15Brr: a gene trap allele of the Akap12 gene, where the gene trap vector contains a phleomycin resistance gene (ble) and lacZ, the 15th analyzed in the laboratory of Jacqueline Barra (Brr).
If the trapped gene is novel, it should be given a name and a symbol, which includes the letters Gt for "gene trap," the vector in parentheses, a serial number, and Laboratory code.
- For example, a gene trapped locus (where the gene is unknown) using vector ROSA, the 26th made in P. Soriano's laboratory, is Gt(ROSA)26Sor.
For high throughput systematic gene trap pipelines, the mutant ES cell line's designation can be used in parentheses instead of the vector designation, and the serial number following the parentheses may be omitted.
Examples:
- Gt(DTM030)Byg: for a trapped gene (at an undefined locus) in mutant ES cell line DTM030, made by BayGenomics.
- Osbpl1aGt(OST48536)Lex: gene trap allele of the oxysterol binding protein-like 1A gene, in mutant ES cell line OST48536, made by Lexicon Genetics, Inc.
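The gene trap forms above (known gene or novel locus, with or without a serial number) can be told apart with a single pattern. This is a sketch only: `parse_gene_trap` is a hypothetical name, the input is the symbol with its superscript flattened as in the examples, gene symbols are assumed to be alphanumeric, and the Laboratory code is assumed to be one capital letter followed by lowercase letters.

```python
import re
from typing import NamedTuple, Optional


class GeneTrap(NamedTuple):
    gene: Optional[str]    # trapped gene symbol, or None for a novel locus
    content: str           # vector content or mutant ES cell line designation
    serial: Optional[int]  # may be omitted for high throughput pipelines
    labcode: str


_GT_PATTERN = re.compile(
    r"(?P<gene>[A-Za-z0-9]*?)Gt\((?P<content>[^()]+)\)"
    r"(?P<serial>\d*)(?P<labcode>[A-Z][a-z]*)"
)


def parse_gene_trap(symbol: str) -> GeneTrap:
    """Split a flattened gene trap symbol into gene, content, serial, Lab code."""
    match = _GT_PATTERN.fullmatch(symbol)
    if match is None:
        raise ValueError(f"not a gene trap symbol: {symbol!r}")
    gene = match.group("gene") or None
    serial = int(match.group("serial")) if match.group("serial") else None
    return GeneTrap(gene, match.group("content"), serial, match.group("labcode"))


for example in ("Akap12Gt(ble-lacZ)15Brr", "Gt(ROSA)26Sor", "Gt(DTM030)Byg"):
    print(parse_gene_trap(example))
```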
Enhancer Traps
Enhancer traps are specialized transgenes. One utility of these transgenes is in creating cre driver lines. Enhancer traps of this type that are currently being created may include a minimal promoter, introns, a cre recombinase cassette (sometimes fused with another element such as ERT2), and polyA sites from different sources.
Nomenclature for these enhancer traps consists of 4 parts as follows:
- Et: prefix for enhancer trap.
- cre recombinase cassette portion in parentheses: for example, cre, icre, or cre/ERT2 (if fused with ERT2).
- line number or serial number: to designate lab trap number or serial number.
- Lab code: ILAR code identifying the creator of this enhancer trap.
Examples:
- Et(icre)1642Rdav: Enhancer trap 1642, Ron Davis.
- Et(cre/ERT2)2047Rdav: Enhancer trap 2047, Ron Davis.
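The four parts can be assembled as in the following sketch. The `EnhancerTrap` class and its field names are hypothetical and serve only to show the order of the parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnhancerTrap:
    cassette: str     # cre recombinase cassette, e.g. "cre", "icre", "cre/ERT2"
    line_number: int  # lab trap number or serial number
    labcode: str      # ILAR Laboratory code of the creator

    def symbol(self) -> str:
        """Assemble 'Et', the cassette in parentheses, the number, the Lab code."""
        if not self.cassette or not self.labcode or self.line_number < 1:
            raise ValueError("cassette, positive line number and Lab code are required")
        return f"Et({self.cassette}){self.line_number}{self.labcode}"


print(EnhancerTrap("icre", 1642, "Rdav").symbol())      # Et(icre)1642Rdav
print(EnhancerTrap("cre/ERT2", 2047, "Rdav").symbol())  # Et(cre/ERT2)2047Rdav
```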
Note that the minimal promoter, poly A source, etc. are not part of the enhancer trap nomenclature. These are molecular details of the specific construct that will be captured in database records and reported with experimental results.
Transgenes
Any DNA that has been stably introduced into the germline of mice or rats is a transgene. Transgenes can be broken down into two categories:
- Those that are produced by homologous recombination as targeted events at particular loci.
- Those that occur by random insertion into the genome (usually by means of microinjection).
Nomenclature for targeted genes is dealt with in the section above on targeted mutations. Random insertion of a transgene in or near an endogenous gene may produce a new allele of this gene. This new allele should be named as described elsewhere in these Guidelines. The transgene itself is a new genetic entity for which a name may be required. This section describes the guidelines for naming the inserted transgene.
It is recognized that it is not necessary, or even desirable, to name all transgenes. For example, if a number of transgenic lines are described in a publication but not all are subsequently maintained or archived, then only those that are maintained require standardized names. The following Guidelines were developed by an interspecies committee sponsored by ILAR in 1992 and modified by the Nomenclature Committee in 1999 and 2000. Transgenic symbols should be submitted to MGD or RGD/RatMap through the usual nomenclature submission form for new loci. The transgene symbol is made up of four parts:
- Tg denoting transgene.
- In parentheses, the official gene symbol of the inserted DNA, using nomenclature conventions of the species of origin.
- The laboratory's line or founder designation or a serial number (note that numbering is independent for mouse and rat series).
- The Laboratory code of the originating lab.
Examples:
- Tg(Zfp38)D1Htz: a transgene containing the mouse Zfp38 gene, in line D1 reported by Nathaniel Heintz.
- Tg(CD8)1Jwg: a transgene containing the human CD8 gene, the first transgenic line using this construct described by the lab of Jon W. Gordon.
- Tg(HLA-B*2705, B2M)33-3Trg: a double transgene in rat containing the human HLA-B*2705 and B2M genes, which were co-injected, giving rise to line 33-3 by Joel D. Taurog.
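The four-part transgene symbol can be decomposed as in the sketch below. It rests on two assumptions that the Guidelines do not state: the line or founder designation ends in a digit, and the Laboratory code is one capital letter followed by lowercase letters. The name `parse_transgene` is hypothetical.

```python
import re
from typing import List, NamedTuple


class Transgene(NamedTuple):
    inserts: List[str]  # gene symbol(s) of the inserted DNA
    line: str           # line or founder designation, or serial number
    labcode: str        # Laboratory code of the originating lab


_TG_PATTERN = re.compile(
    r"Tg\((?P<insert>[^()]+)\)(?P<line>.*\d)(?P<labcode>[A-Z][a-z]*)"
)


def parse_transgene(symbol: str) -> Transgene:
    """Split a transgene symbol into inserted genes, line designation, Lab code."""
    match = _TG_PATTERN.fullmatch(symbol)
    if match is None:
        raise ValueError(f"not a transgene symbol: {symbol!r}")
    # Co-injected genes of a double transgene are separated by commas.
    inserts = [part.strip() for part in match.group("insert").split(",")]
    return Transgene(inserts, match.group("line"), match.group("labcode"))


for example in ("Tg(Zfp38)D1Htz", "Tg(CD8)1Jwg", "Tg(HLA-B*2705, B2M)33-3Trg"):
    print(parse_transgene(example))
```

The insert is split on commas only, because a hyphen can be part of a gene symbol (as in HLA-B*2705).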
Different transgenic constructs containing the same gene should not be differentiated in the symbol; they will use the same gene symbol in parentheses and will be distinguished by the serial number/Laboratory code. Information about the nature of the transgenic entity should be given in associated publications and database entries.
In many cases, a large number of transgenic lines are made from the same gene construct and only differ by tissue specificity of expression. The most common of these are transgenes that use reporter constructs or recombinases (e.g., GFP, lacZ, cre), where the promoter should be specified as the first part of the gene insertion designation, separated by a hyphen from the reporter or recombinase designation. The SV40 large T antigen is another example. The use of promoter designations is helpful in such cases.
Examples:
- Tg(Wnt1-LacZ)206Amc: the LacZ transgene with a Wnt1 promoter, from mouse line 206 in the laboratory of Andrew McMahon.
- Tg(Zp3-cre)3Mrt: the cre transgene with a Zp3 promoter, the third transgenic mouse line from the laboratory of Gail Martin.
In the case of a fusion gene insert, where roughly equal parts of two genes compose the construct, a forward slash separates the two genes in parentheses.
Example:
Tg(TCF3/HLF)1Mlc: a transgene in which the human transcription factor 3 gene and the hepatic leukemia factor gene were inserted as a chimeric fusion cDNA, the first transgenic mouse line produced by Michael L. Cleary's laboratory (Mlc).
This scheme is to name the transgene entity only. The mouse or rat strain on which the transgene is maintained should be named separately as in the Rules and Guidelines for Nomenclature of Mouse and Rat Strains. In describing a transgenic mouse or rat strain, the strain name should precede the transgene designation.
Examples:
- C57BL/6J-Tg(CD8)1Jwg: mouse strain C57BL/6J carrying the Tg(CD8)1Jwg transgene.
- F344/CrlBR-Tg(HLA-B*2705, B2M)33-3Trg: rat strain F344/CrlBR carrying the Tg(HLA-B*2705, B2M)33-3Trg double transgene.
For BAC transgenics, the insert designation is the BAC clone and follows the same naming convention as the Clone Registry at NCBI.
Example:
Tg(RP22-412K21)15Som: a BAC transgene where the inserted BAC is from the RP22 BAC library, plate 412, row K, column 21. It is the 15th made in mouse in the laboratory of Stefan Somlo (Som).
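The BAC clone designation in this example encodes library, plate, row, and column. A minimal sketch of reading those fields follows. It assumes the layout seen in the example (library, hyphen, plate number, single-letter row, column number) and is not a full implementation of the Clone Registry convention. The name `parse_bac_clone` is hypothetical.

```python
import re
from typing import NamedTuple


class BacClone(NamedTuple):
    library: str
    plate: int
    row: str
    column: int


# Assumed layout: <library>-<plate digits><row letter><column digits>
_BAC_PATTERN = re.compile(
    r"(?P<library>[A-Za-z0-9]+)-(?P<plate>\d+)(?P<row>[A-Z])(?P<column>\d+)"
)


def parse_bac_clone(clone: str) -> BacClone:
    """Split a BAC clone name such as 'RP22-412K21' into its fields."""
    match = _BAC_PATTERN.fullmatch(clone)
    if match is None:
        raise ValueError(f"unexpected BAC clone name: {clone!r}")
    return BacClone(
        match.group("library"),
        int(match.group("plate")),
        match.group("row"),
        int(match.group("column")),
    )


print(parse_bac_clone("RP22-412K21"))
```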
Transgenes containing RNAi constructs can be designated minimally as:
Tg(RNAi:geneX)#Labcode, where
- geneX is the gene that is knocked down
- # is the serial number of the transgene
An expanded version of this designation is:
Tg(Pro-yyRNAi:geneX)#Labcode, where
- Pro- can be used optionally to designate the promoter
- yy can be used optionally for the specific RNAi construct
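Both the minimal and the expanded designation can be produced by one helper, sketched below. The function name `rnai_transgene_symbol` is hypothetical, and the values used in the calls (gene Xyz, Laboratory code Abc, promoter Pro, construct prefix sh) are placeholders rather than real symbols.

```python
from typing import Optional


def rnai_transgene_symbol(
    gene: str,
    serial: int,
    labcode: str,
    promoter: Optional[str] = None,
    construct: Optional[str] = None,
) -> str:
    """Build Tg(RNAi:geneX)#Labcode, optionally with promoter and construct."""
    promoter_part = f"{promoter}-" if promoter else ""  # optional "Pro-"
    construct_part = construct or ""                    # optional "yy"
    return f"Tg({promoter_part}{construct_part}RNAi:{gene}){serial}{labcode}"


print(rnai_transgene_symbol("Xyz", 1, "Abc"))               # Tg(RNAi:Xyz)1Abc
print(rnai_transgene_symbol("Xyz", 1, "Abc", "Pro", "sh"))  # Tg(Pro-shRNAi:Xyz)1Abc
```

Keeping the optional parts as keyword arguments mirrors the advice that detail inside the parentheses should be kept to a minimum.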
While there is the option to include significant information on vectors, promoters, etc. within the parentheses of a transgene symbol, this should be minimized for brevity and clarity. The function of a symbol is to provide a unique designation to a gene, locus, or mutation. The fine molecular detail of these loci and mutations should reside in databases such as MGD and RGD.