TREC 2001 CROSS LANGUAGE DATASET

Metadata Updated: March 14, 2025

Ten groups participated in the TREC-2001 cross-language information retrieval track, which focused on retrieving Arabic-language documents based on 25 queries originally prepared in English. French and Arabic translations of the queries were also available. This was the first year in which a large Arabic test collection was available, so participants tried a variety of approaches and ran a rich set of experiments using resources such as machine translation, parallel corpora, several approaches to stemming and/or morphology, and both pre-translation and post-translation blind relevance feedback. On average, forty percent of the relevant documents discovered by a participating team were found by no other team, a higher rate than is normally observed at TREC. This raises some concern that the relevance judgment pools may be less complete than has historically been the case.
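
The "found by no other team" figure above can be illustrated with a small Python sketch. This is one plausible reading of the statistic, not the track's own procedure: the function name, the team labels, and the document identifiers are all hypothetical, and the track report may define or average the quantity differently (for example, per query).

```python
from collections import Counter


def mean_unique_relevant_fraction(found_by_team):
    """Average, over teams, of the share of each team's relevant
    documents that no other team found.

    found_by_team maps a team label to the set of relevant document
    IDs that team retrieved (hypothetical input format).
    """
    # Count how many teams found each relevant document.
    team_counts = Counter(
        doc for docs in found_by_team.values() for doc in set(docs)
    )

    fractions = []
    for docs in found_by_team.values():
        docs = set(docs)
        if not docs:
            continue  # a team with no relevant documents is skipped
        unique = sum(1 for doc in docs if team_counts[doc] == 1)
        fractions.append(unique / len(docs))

    return sum(fractions) / len(fractions) if fractions else 0.0


# Toy data, invented purely for illustration.
toy_runs = {
    "team_a": {"d1", "d2", "d3", "d4"},
    "team_b": {"d1", "d2", "d5"},
    "team_c": {"d2", "d6"},
}

print(round(mean_unique_relevant_fraction(toy_runs), 3))
```

A high value means that much of the known relevant set depends on a single team's contribution, which is why it suggests that further relevant documents may remain unjudged outside the pools.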

Access & Use Information

Public: This dataset is intended for public access and use. License: See this page for license information.

Downloads & Resources

References

http://trec.nist.gov/pubs/trec10/papers/clirtrack.pdf

Dates

Metadata Created Date March 14, 2025
Metadata Updated Date March 14, 2025

Metadata Source

Harvested from NIST

Additional Metadata

Resource Type Dataset
Publisher National Institute of Standards and Technology
Maintainer
Identifier ark:/88434/mds2-3588
Data First Published 2024-11-22
Language en
Data Last Modified 2024-10-02 00:00:00
Category Information Technology
Public Access Level public
Bureau Code 006:55
Metadata Context https://project-open-data.cio.gov/v1.1/schema/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id 1f15eb26-f1c8-4a3d-b981-80f46f88f17e
Harvest Source Id 74e175d9-66b3-4323-ac98-e2a90eeb93c0
Harvest Source Title NIST
Homepage URL https://data.nist.gov/od/id/mds2-3588
License https://www.nist.gov/open/license
Program Code 006:045
Related Documents http://trec.nist.gov/pubs/trec10/papers/clirtrack.pdf
Source Datajson Identifier True
Source Hash ed9a87264d5fecf86f8c452ddb81950f48117745cda34944a3c2920a5a95ca3d
Source Schema Version 1.1
