bitsgalore.org

Posts

Writerperfect conversion tools for legacy file formats

Quattro Pro for DOS revisited: an obsolete format no more?

Emulating Microsoft Multiplan spreadsheets in DOSBox-X

Changes to the blog: migration to Codeberg and ActivityPub-based comments

Y2K

PDF Quality assessment for digitisation batches with Python, PyMuPDF and Pillow

Escape from the phantom of the PDF

JPEG quality estimation using simple least squares matching of quantization tables

JPEG quality estimation: experiments with a modified ImageMagick heuristic

Multi-image TIFFs, subfiles and image file directories

VeraPDF parse status as a proxy for PDF rendering: experiments with the Synthetic PDF Testset

Identification of PDF preservation risks with VeraPDF and JHOVE

Extracting text from EPUB files in Python

Moving my Internet domains

Writing yet another workflow tool for imaging portable media

How to preserve your personal Twitter archive

Wheel Out the Digital Dark Age Klaxon!

Identification of physical storage media and devices with Python and the Windows API

Introducing Isolyzer 1.4

Generating lossy access JP2s from lossless preservation masters

On The Significant Properties of Spreadsheets

PDF processing and analysis with open-source tools

Towards a preservation workflow for mobile apps

Four Android emulators, two apps

Mapping the Dutch web domain

Restoring Liesbet's Virtual Home, a digital treasure from the early Dutch web

ISO/IEC TS 22424 standard on EPUB3 preservation

Does Microsoft OneDrive export large ZIP files that are corrupt?

Offline digital data carriers in the KB deposit collection

Web domain geolocation and spatial analysis with QGIS

Recovering '90s Data Tapes - Experiences From the KB Web Archaeology project (iPres 2019 paper)

A simple disk imaging workflow tool

A simple workflow tool for imaging optical media using readom and ddrescue

Roll the tape - recovering '90s data tapes in BitCurator

Crawling offline web content: the NL-menu case

Resurrecting the first Dutch web index: NL-menu revisited

Update on Isolyzer: UDF, HFS+ and more!

Image and Rip Optical Media Like A Boss!

Policy-based assessment with VeraPDF - a first impression

Imaging CD-Extra / Blue Book discs

Detecting broken ISO images: introducing Isolyzer

Breaking WAVEs (and some FLACs too)

PDF/A as a preferred, sustainable format for spreadsheets?

Valid, but not accessible: crazy fixed EPUB layouts

The future of EPUB? A first look at the EPUB 3.1 Editor’s draft

Jpylyzer 2015 round-up

Preserving optical media from the command-line

Response to report on JPEG 2000 expert round table

Why PDF/A validation matters, even if you don't have PDF/A - Part 2

Why PDF/A validation matters, even if you don't have PDF/A

Top 50 file formats in the KB e-Depot

Policy-based assessment of EPUB with Epubcheck

Dutch newspaper wipes out articles citing fabricated sources - Internet Archive to the rescue!

Perdiep Ramesar in the Internet Archive

Demise of the Dutch Blogosphere

Quattro Pro for DOS: an obsolete format at last?

Running archived Android apps on a PC: first impressions

Six ways to decode a lossy JP2

Jpylyzer software finalist for digital sustainability award

When (not) to migrate a PDF to PDF/A

How to save a web page to the Internet Archive

Why can't we have digital preservation tools that just work?

Identification of PDF preservation risks: analysis of Govdocs selected corpus

Measuring Bigfoot

Assessing file format risks: searching for Bigfoot?

Optimising archival JP2s for the derivation of access copies

Identification of PDF preservation risks with Apache Preflight: the sequel

ICC profiles and resolution in JP2: update on 2011 D-Lib paper

EPUB for archival preservation: an update

Adventures in Debian packaging

What do we mean by "embedded" files in PDF?

Identification of PDF preservation risks with Apache Preflight: a first impression

Automated assessment of JP2 against a technical profile

Magic editing and creation: a primer

PDF – Inventory of long-term preservation risks

EPUB for archival preservation

Update on jpylyzer

Jpylyzer documentation

A prototype JP2 validator and properties extractor

Evaluation of identification tools: first results from SCAPE

A simple JP2 file structure checker

Improved identification of XML: a Python experiment

Paper on JPEG 2000 for preservation

Ensuring the suitability of JPEG 2000 for preservation