Class MwSitesDumpFileProcessor

java.lang.Object
org.wikidata.wdtk.dumpfiles.MwSitesDumpFileProcessor
All Implemented Interfaces:
MwDumpFileProcessor

public class MwSitesDumpFileProcessor extends Object implements MwDumpFileProcessor
This class processes dump files that contain the SQL dump of the MediaWiki sites table.

The class expects all URLs in the dump to be protocol-relative (i.e., starting with "//" rather than with "http://" or "https://") and prepends "http:" to each of them.
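The URL handling described above can be illustrated with a minimal sketch. The class `ProtocolRelativeUrls` and its method `toHttpUrl` are hypothetical names introduced here for illustration; they are not part of Wikidata Toolkit and do not reproduce its implementation. This page does not say how the real class treats a URL that is not protocol-relative, so rejecting such input is an assumption of the sketch.

```java
// Illustrative stand-in only; NOT part of Wikidata Toolkit.
final class ProtocolRelativeUrls {

    private ProtocolRelativeUrls() {
        // utility class, no instances
    }

    /**
     * Prepends "http:" to a protocol-relative URL such as
     * "//en.wikipedia.org/wiki/$1".
     *
     * Assumption of this sketch: input that does not start with "//"
     * is rejected, because the documented expectation is that every
     * URL in the dump is protocol-relative.
     */
    static String toHttpUrl(String url) {
        if (!url.startsWith("//")) {
            throw new IllegalArgumentException(
                    "Not a protocol-relative URL: " + url);
        }
        return "http:" + url;
    }

    public static void main(String[] args) {
        System.out.println(toHttpUrl("//en.wikipedia.org/wiki/$1"));
    }
}
```

Under these assumptions, a sites table entry stored as "//en.wikipedia.org/wiki/$1" would be exposed with an "http:" scheme, regardless of whether the site is also reachable over HTTPS.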

Author:
Markus Kroetzsch
  • Constructor Details

    • MwSitesDumpFileProcessor

      public MwSitesDumpFileProcessor()
  • Method Details

    • getSites

      public Sites getSites()
      Returns the site information extracted from the dump file(s) that this processor has processed so far.
      Returns:
      the sites information
    • processDumpFileContents

      public void processDumpFileContents(InputStream inputStream, MwDumpFile dumpFile)
      Description copied from interface: MwDumpFileProcessor
      Process dump file data from the given input stream.

      The input stream is obtained from the given dump file via MwDumpFile.getDumpFileStream(). It will be closed by the caller.

      Specified by:
      processDumpFileContents in interface MwDumpFileProcessor
      Parameters:
      inputStream - to access the contents of the dump
      dumpFile - to access further information about this dump
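The stream ownership rule stated above (the caller closes the stream, the processor only reads from it) can be sketched as follows. `StreamProcessor`, `TrackingStream`, and `processAndClose` are hypothetical stand-ins defined only for this example; they are not Wikidata Toolkit types, and the real caller in the library may be structured differently.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustrative stand-in only; NOT part of Wikidata Toolkit.
final class CallerClosesStream {

    /** Hypothetical processor contract: read from the stream, never close it. */
    interface StreamProcessor {
        void process(InputStream in) throws IOException;
    }

    /** In-memory stream that records whether close() has been called. */
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /**
     * Plays the role of the caller: hands the stream to the processor and
     * closes it afterwards via try-with-resources, even if processing fails.
     */
    static void processAndClose(TrackingStream stream, StreamProcessor processor) {
        try (InputStream in = stream) {
            processor.process(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TrackingStream stream = new TrackingStream(
                "INSERT INTO sites ...".getBytes(StandardCharsets.UTF_8));
        processAndClose(stream, in -> {
            // Consume the stream without closing it.
            while (in.read() != -1) {
                // discard bytes
            }
        });
        System.out.println("closed by caller: " + stream.closed);
    }
}
```

An implementation of processDumpFileContents written against this contract would therefore read from inputStream without wrapping it in its own try-with-resources block or calling close() on it.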