{"@type": "dcat:Dataset", "accessLevel": "public", "bureauCode": ["005:18"], "contactPoint": {"fn": "Sparks, Michael", "hasEmail": "mailto:Michael.Sparks2@USDA.GOV"}, "description": "<p>This dataset presents the <em>Halyomorpha halys</em> Official Gene Set (OGS) v1.2. OGSv1.2 is an update of <em>Halyomorpha halys</em> OGSv1.1 (<a href=\"https://doi.org/10.15482/USDA.ADC/1504240\">https://doi.org/10.15482/USDA.ADC/1504240</a>) to the coordinates of genome assembly GCA_000696795.3 (<a href=\"https://www.ncbi.nlm.nih.gov/assembly/GCA_000696795.3\">https://www.ncbi.nlm.nih.gov/assembly/GCA_000696795.3</a>) using <a href=\"https://github.com/NAL-i5K/coordinates_conversion/\">https://github.com/NAL-i5K/coordinates_conversion/</a>. </p>\n<p>The original OGSv1.0 is an integration of automatic gene predictions from NCBI's eukaryotic annotation pipeline, NCBI Halyomorpha halys Annotation Release 100 (<a href=\"https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Halyomorpha_halys/100/\">https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Halyomorpha_halys/100/</a>; ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/696/795/GCF_000696795.1_Hhal_1.0), with manual annotations by the research community (performed via the Apollo manual curation software, <a href=\"http://genomearchitect.org/\">http://genomearchitect.org/</a>). Manual annotations performed by the community were downloaded from Apollo, QC'd, and merged with NCBI Halyomorpha halys Annotation Release 100 using the GFF3toolkit software (<a href=\"https://github.com/NAL-i5K/GFF3toolkit/releases/tag/v1.4.4\">https://github.com/NAL-i5K/GFF3toolkit/releases/tag/v1.4.4</a>). The resulting merged dataset was formatted for ingest into the i5k Workspace and GenBank databases, resulting in <em>Halyomorpha halys</em> Official Gene Set (OGS) v1.0. 
</p>\n<p>Halyomorpha Official Gene Set halhal_OGSv1.1 is a minor update of halhal_OGSv1.0: Alias attributes were added to all manually annotated cathepsin models; six models from contaminated scaffolds were removed; and notes were added to three models located on possibly contaminated scaffolds. </p><div><br>Resources in this dataset:</div><br><ul><li><p>Resource Title: Halyomorpha halys Official Gene Set OGSv1.2.</p> <p>File Name: halhal_OGSv1.2.tar.gz</p><p>Resource Description: The attached tar.gz archive (halhal_OGSv1.2.tar.gz) contains the following files:</p>\n<p>halhal_OGSv1.2.gff. GFF3 of all gene predictions of Halyomorpha halys genome annotations OGSv1.2\nhalhal_OGSv1.2_CDS.fa. CDS sequences of Halyomorpha halys genome annotations OGSv1.2\nhalhal_OGSv1.2_pep.fa. Amino acid sequences of Halyomorpha halys genome annotations OGSv1.2\nhalhal_OGSv1.2_trans.fa. Transcript sequences of Halyomorpha halys genome annotations OGSv1.2\nreadme. Readme file describing Halyomorpha halys genome annotations OGSv1.2</p>\n<p></p></li></ul><p></p>", "distribution": [{"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44291273", "format": "gz", "mediaType": "application/x-gzip", "title": "halhal_OGSv1.2.tar.gz"}], "identifier": "10.15482/USDA.ADC/1518751", "keyword": ["genome assembly", "sequence analysis", "Halyomorpha halys", "genome annotation", "data.gov", "ARS"], "license": "https://creativecommons.org/licenses/by-sa/4.0/", "modified": "2025-11-21", "programCode": ["005:040"], "publisher": {"@type": "org:Organization", "name": "Agricultural Research Service"}, "spatial": "{\"type\": \"Polygon\", \"coordinates\": [[[-172.96875, -85.973919490277], [-172.96875, 85.513398309887], [194.0625, 85.513398309887], [194.0625, -85.973919490277], [-172.96875, -85.973919490277]]]}", "temporal": "2019-01-01/2019-01-01", "title": "Halyomorpha halys Official Gene Set v1.2"}