Description
Digitized Official Publications
Overview
This dataset contains digitized official publications hosted by the Swiss Federal Archives, covering the period from 1798 to 2021. The data was extracted from amtsdruckschriften.bar.admin.ch by web scraping.
Dataset Structure
The dataset is organized as follows:
- Each publication is contained in a separate folder within the ZIP file
- Each folder contains:
  - PDF file of the original document
  - CSV file with metadata
  - TXT file containing OCR-extracted text
- Root directory contains:
  - metadata_ads.csv: Comprehensive metadata for all documents
  - List of missing files from the extraction process
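The layout described above can be inspected without unpacking the full archive. The following is a minimal sketch, assuming that files are told apart by extension only and that root-level files sit directly at the top of the ZIP. The local path `ADS.zip` and the helper names `group_by_folder` and `list_publications` are illustrative and not part of the dataset.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_folder(member_names):
    """Map each folder to the PDF, CSV and TXT files found directly inside it."""
    folders = defaultdict(dict)
    for name in member_names:
        path = PurePosixPath(name)
        ext = path.suffix.lower().lstrip(".")
        # Skip directory entries and files of other types.
        if name.endswith("/") or ext not in {"pdf", "csv", "txt"}:
            continue
        # Root-level files such as metadata_ads.csv have parent "." and are skipped.
        if str(path.parent) == ".":
            continue
        folders[str(path.parent)][ext] = name
    return dict(folders)


def list_publications(zip_path="ADS.zip"):
    """Read only the ZIP's central directory; no document is extracted."""
    with zipfile.ZipFile(zip_path) as archive:
        return group_by_folder(archive.namelist())
```

If the archive turns out to wrap everything in one top-level folder, the root-level files will show up under that folder's key and the grouping would need to be adjusted accordingly.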
Available Collections
The platform hosts the following collections (✓ = included in this dataset, ✗ = not included):
Federal Assembly (Bulletin, Minutes, Finding Aids)
- ✓ Official Bulletin of the Federal Assembly (1891-1999) - 46,916 documents
- ✓ Federal Assembly Minutes (1921-1970) - 7,126 documents
- ✓ Directory of Proceedings (1848-1891) - 2 documents
- ✓ Overview of Proceedings (1891-1995) - 842 documents
Federal Law
- ✓ Federal Gazette (1849-2008) - 58,988 documents
- ✓ Consolidated Collection (1848-1947) - 65 documents
- ✗ Official Compilation (1948-1998)
- ✓ Administrative Practice of Federal Authorities (1987-2017) - 2,502 documents
Federal Council
- ✓ Federal Council Minutes (1848-1973) - 18,255 documents
- ✓ Federal Council Management Reports (1848-1995) - 443 documents
Other Collections
- ✓ Federal Calendar (1849-2021) - 4,254 documents
- ✗ Federal State Account and Budget (1848-2006)
- ✗ Diplomatic Documents of Switzerland (1848-1946)
- ✗ Studies and Sources (1975-2005)
- ✗ Collection of Acts from the Helvetic Republic Period (1798-1803)
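The per-collection counts listed above can be summed to get a rough expected size for sanity-checking a download. This is a small sketch that uses only the figures given for the collections listed with a document count; the total is not stated by the source and does not account for missing files. The names `INCLUDED_COLLECTIONS` and `total_documents` are illustrative.

```python
# Document counts as listed above for the collections included in this dataset.
INCLUDED_COLLECTIONS = {
    "Official Bulletin of the Federal Assembly": 46_916,
    "Federal Assembly Minutes": 7_126,
    "Directory of Proceedings": 2,
    "Overview of Proceedings": 842,
    "Federal Gazette": 58_988,
    "Consolidated Collection": 65,
    "Administrative Practice of Federal Authorities": 2_502,
    "Federal Council Minutes": 18_255,
    "Federal Council Management Reports": 443,
    "Federal Calendar": 4_254,
}


def total_documents(counts):
    """Sum the listed document counts across collections."""
    return sum(counts.values())


print(total_documents(INCLUDED_COLLECTIONS))
```

The ten listed counts add up to 139,393 documents.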
Document Identification
Each document is identified by an eight-digit ID (pdfnum/ais_id) that combines:
- Publication type identifier (first digits)
- Running number within the publication type
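The two parts of an ID can be separated as sketched below. The description does not say how many leading digits form the publication type identifier, so the prefix length is left as a required parameter; the function name `split_document_id` and the example ID are hypothetical.

```python
def split_document_id(doc_id, prefix_len):
    """Split an eight-digit ID into (publication type prefix, running number).

    prefix_len is an assumption supplied by the caller: the source only says
    that the "first digits" identify the publication type.
    """
    doc_id = str(doc_id)
    if len(doc_id) != 8 or not (doc_id.isascii() and doc_id.isdigit()):
        raise ValueError(f"expected an eight-digit ID, got {doc_id!r}")
    if not 0 < prefix_len < 8:
        raise ValueError("prefix_len must be between 1 and 7")
    return doc_id[:prefix_len], int(doc_id[prefix_len:])


# Hypothetical ID, split under the assumption of a four-digit prefix.
prefix, running_number = split_document_id("10012345", prefix_len=4)
```

Reading IDs as strings rather than integers is the safer choice here, since an integer conversion would drop any leading zeros and the length check would then fail.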
Technical Details
- Format: ZIP archive containing folders with PDF, CSV, and TXT files
- OCR: All text documents have been processed using OCR technology
- Metadata: Structured in CSV format
- Time span: 1798-2021 (varies by collection)
- Languages: German, French, Italian (original documents in their respective languages)
Usage Notes
- The dataset is intended for research and reference purposes
- Some documents may be missing from the original collection
- OCR quality may vary depending on the original document condition
- Metadata includes additional contextual information about each document
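Because some documents may be missing, it can help to compare the root metadata file against the folders actually present. The sketch below makes several assumptions that should be checked against the real files: that metadata_ads.csv is comma-separated with a header row, that it has an ID column named `ais_id`, and that each publication folder is named after its document ID. The function name `ids_without_folder` is illustrative.

```python
import csv
import io


def ids_without_folder(metadata_file, folder_names, id_column="ais_id"):
    """Return IDs listed in the metadata that have no matching folder.

    metadata_file is an open text file; id_column is an assumed column name.
    """
    present = set(folder_names)
    reader = csv.DictReader(metadata_file)
    return [row[id_column] for row in reader if row[id_column] not in present]


# Small in-memory example with hypothetical IDs.
sample = io.StringIO("ais_id,title\n10000001,First\n10000002,Second\n")
missing = ids_without_folder(sample, ["10000001"])
```

With the real data, the file would be opened with `open("metadata_ads.csv", newline="")` using whatever encoding the file actually has, and the result could be cross-checked against the list of missing files shipped in the root directory.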
Additional information
- Identifier: ads-zip
- Issued date: February 25, 2025
- Modified date: -
- Languages: Language independent
- Access URL: https://sfa-laboratory.ch/data/ads/ADS.zip
- Download URL: https://sfa-laboratory.ch/data/ads/ADS.zip
- File size: 90.6 GB
- Format: ZIP
- Documentation: -