In-house annotated gene set for the pecan weevil, Curculio caryae

Metadata Updated: April 1, 2026

This in-house annotated gene set was created using the following methods.

RNA was isolated from the head and thorax segments of one adult male and one adult female pecan weevil using the NucleoMag RNA Kit (Macherey-Nagel, Düren, Germany, 744350.1) according to the kit protocol. Isolated RNA was processed into PacBio Kinnex sequencing libraries using the Iso-Seq express 2.0 kit (Pacific Biosciences, Menlo Park, CA, USA, 103-071-500) and the Kinnex full-length RNA kit (Pacific Biosciences, Menlo Park, CA, USA, 103-072-000).

The prepared library was bound and sequenced at the USDA-ARS Veterinary Pest Genetics Research Unit in Kerrville, Texas, on two Pacific Biosciences SMRT cell trays with a Revio system (Pacific Biosciences, Menlo Park, CA, USA, 102-202-200), beginning with a 2-h pre-extension followed by a 30-h movie collection time. After sequencing, circular consensus sequences were obtained from the PacBio Sequel Revio subreads using SMRT Link v13.0.

Reads were then mapped to the repeat-masked genome assembly using minimap2 with the spliced-alignment preset (-ax splice:hq) to generate SAM mapping files. These were converted to BAM files using samtools view -bS and used as input for gene model prediction with BRAKER version 3.0.8 (https://github.com/Gaius-Augustus/BRAKER), which generated 72,879 gene models. The gene models and their predicted amino acid sequences were further curated and annotated with gene ontologies and protein domains using InterProScan-5.73-104.0 with the PANTHER-19.0 and Pfam-37.2 databases (https://github.com/ebi-pf-team/interproscan), resulting in 19,508 InterProScan results.
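The read-mapping and SAM-to-BAM conversion steps described above can be sketched as follows. This is a minimal illustration, not the authors' actual script: the file names (genome.masked.fa, ccs_reads.fastq, weevil) and the helper functions are hypothetical, and only the options named in the description (-ax splice:hq for minimap2, -bS for samtools view) plus each tool's standard -o output option are used.

```python
import subprocess


def build_mapping_commands(genome_fasta, reads_fastq, out_prefix):
    """Build the minimap2 and samtools commands as argument lists.

    All three arguments are hypothetical placeholders; the dataset
    description does not give the real file names.
    """
    sam_path = f"{out_prefix}.sam"
    bam_path = f"{out_prefix}.bam"

    # Spliced alignment of the reads against the repeat-masked assembly,
    # written as SAM (-a) with the splice:hq preset (-x).
    minimap2_cmd = [
        "minimap2", "-ax", "splice:hq",
        "-o", sam_path,
        genome_fasta, reads_fastq,
    ]

    # Convert the SAM mapping file to BAM, as in "samtools view -bS".
    samtools_cmd = [
        "samtools", "view", "-bS",
        "-o", bam_path,
        sam_path,
    ]
    return minimap2_cmd, samtools_cmd


def run_mapping(genome_fasta, reads_fastq, out_prefix):
    """Run both steps in order; requires minimap2 and samtools on PATH."""
    for cmd in build_mapping_commands(genome_fasta, reads_fastq, out_prefix):
        subprocess.run(cmd, check=True)  # raise if either tool fails


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch can be
    # inspected without the tools or input files being present.
    for cmd in build_mapping_commands(
        "genome.masked.fa", "ccs_reads.fastq", "weevil"
    ):
        print(" ".join(cmd))
```

Building the commands as argument lists and passing them to subprocess.run without a shell avoids quoting problems with unusual file paths. The resulting BAM file is what would then be supplied to BRAKER for gene model prediction.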

Access & Use Information

Public: This dataset is intended for public access and use. License: Creative Commons CCZero

Dates

Metadata Created Date January 6, 2026
Metadata Updated Date April 1, 2026

Metadata Source

Harvested from USDA JSON

Additional Metadata

Resource Type Dataset
Publisher Agricultural Research Service
Maintainer
Identifier 10.15482/USDA.ADC/30234490.v1
Data Last Modified 2026-03-23
Public Access Level public
Bureau Code 005:18
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id 046a829f-dd6a-495d-aa57-a7ce0e803671
Harvest Source Id d3fafa34-0cb9-48f1-ab1d-5b5fdc783806
Harvest Source Title USDA JSON
License https://creativecommons.org/publicdomain/zero/1.0/
Program Code 005:040
Source Datajson Identifier True
Source Hash 0ac43059d18baddbfde4d4671ea7939e55015311dc6faf22af78ba7aefbaf287
Source Schema Version 1.1
Temporal 2023-09-19/2025-09-29
