The file "uniprot-XX-XXXX-XXXX.tsv.gz" contains mappings from UniProt
Accessions to EggNOG Orthologous Groups (OGs) at 4 taxonomic levels:
LUCA, Bacteria, Eukaryotes, Archaea.

ABOUT THE MAIN FILE CONTENT
===============================

- Each row represents a UniProt Accession.
- Columns are tab delimited.
- Columns 2, 3, 4 and 5 list the OGs in which the protein exists at each
  of the 4 taxonomic levels.
- A field is left empty if no OG is available at that taxonomic level.
- If a protein exists in more than one OG (e.g. due to gene fusions), the
  OG names are separated by commas.
- If a protein does not belong to ANY known orthologous group, columns
  2, 3, 4 and 5 are each filled with a dash.

Examples:

#UniprotAccession  OGs_LUCA     OGs_bacteria  OGs_eukaryotes   OGs_archaea
C4G7Z8             COG2801      ENOG4105DQ6
C4GBP2             ENOG4111QY9  ENOG4105K1I
C4G7Z7             ENOG4111SA5  ENOG4105GTZ
C4GBP3             -            -             -                -
C4GA84             -            -             -                -
C4GDZ8             -            -             -                -
Q4RE88             ENOG410XQFD                KOG0613,KOG0032
Q4SR43             COG0515                    KOG3610,KOG1095
Q4T8P0             COG0480                    KOG0462,KOG0467

NOTE:
=======

Four separate files are also provided, one for each taxonomic level. Those
files contain only UniProt Accessions with a match in EggNOG. Each is a
two-column, tab-delimited file. Some entries match more than one group; in
such cases, the matching groups are separated by commas in the second
column.

EggNOG linking
======================

- You can link to the default EggNOG view for a given protein id by
  using: [UniprotAccession]
- To find a protein in a specific OG, use:
  [UniprotAccession]&target_nogs=[NOGID]

Examples:

Note that LUCA is the default taxonomic level and contains sequences from
all three domains of life (mappings to the original COG groups are included
at this level). However, more precise orthologs are provided if the
taxonomic scope is restricted to one of the other 3 levels: Bacteria,
Eukaryota or Archaea.
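
The row format described under ABOUT THE MAIN FILE CONTENT can be read with
a few lines of Python. The following is only a minimal sketch, not an
official parser: the names LEVELS, parse_row and read_mappings are
hypothetical, and it assumes that the header line starts with "#", that an
empty field and a dash both mean "no OG at this level", and that trailing
empty columns may be missing from a row.

```python
import gzip

# Column order taken from the example header line; names are illustrative.
LEVELS = ("LUCA", "bacteria", "eukaryotes", "archaea")


def parse_row(line):
    """Split one tab-delimited row into (accession, {level: [OG, ...]})."""
    fields = line.rstrip("\r\n").split("\t")
    accession = fields[0]
    # Pad short rows so that missing trailing columns count as empty.
    values = (fields[1:] + [""] * len(LEVELS))[:len(LEVELS)]
    ogs = {}
    for level, value in zip(LEVELS, values):
        value = value.strip()
        # Assumption: an empty field and "-" both mean no OG at this level.
        ogs[level] = [] if value in ("", "-") else value.split(",")
    return accession, ogs


def read_mappings(path):
    """Yield (accession, ogs) for every data row of the gzipped main file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip the "#UniprotAccession ..." header and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            yield parse_row(line)
```

For example, parse_row applied to a row holding "Q4RE88", "ENOG410XQFD", an
empty field, "KOG0613,KOG0032" and another empty field returns the accession
together with one LUCA group, no bacterial group, two eukaryotic groups and
no archaeal group.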