Package org.wikidata.wdtk.dumpfiles
Class WikibaseRevisionProcessor
java.lang.Object
org.wikidata.wdtk.dumpfiles.WikibaseRevisionProcessor
- All Implemented Interfaces:
MwRevisionProcessor
A revision processor that processes Wikibase entity content from a dump file.
Revisions are parsed to obtain EntityDocument objects.
- Author:
- Markus Kroetzsch
-
Constructor Summary
Constructor | Description
WikibaseRevisionProcessor(EntityDocumentProcessor entityDocumentProcessor, String siteIri) | Constructor.
Method Summary
Modifier and Type | Method | Description
void | finishRevisionProcessing() | Performs final actions that should be done after all revisions in a batch of revisions have been processed.
void | processItemRevision(MwRevision mwRevision) |
void | processPropertyRevision(MwRevision mwRevision) |
void | processRevision(MwRevision mwRevision) | Process the given MediaWiki revision.
void | startRevisionProcessing(String siteName, String baseUrl, Map<Integer, String> namespaces) | Initialises the revision processor for processing revisions.
-
Constructor Details
-
WikibaseRevisionProcessor
Constructor.
- Parameters:
entityDocumentProcessor - the object that entity documents will be forwarded to
siteIri - the IRI of the site that the data comes from, as used in EntityIdValue.getSiteIri()
-
-
Method Details
-
startRevisionProcessing
public void startRevisionProcessing(String siteName, String baseUrl, Map<Integer, String> namespaces)
Description copied from interface: MwRevisionProcessor
Initialises the revision processor for processing revisions. General information about the configuration of the site for which revisions are being processed is provided.
- Specified by:
startRevisionProcessing in interface MwRevisionProcessor
- Parameters:
siteName - the name of the site
baseUrl - the base URL of the site
namespaces - map from integer namespace ids to namespace prefixes; namespace strings do not include the final ":" used in MediaWiki to separate namespace prefixes from article titles, and the prefixes use spaces, not underscores as in MediaWiki URLs.
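The namespace convention described for the namespaces parameter can be illustrated with a minimal sketch. The helper class NamespaceTitles, its method names, and the sample ids and prefixes are assumptions made for illustration; they are not part of the documented API. The sketch shows how a prefix taken from the map would be joined to an article title with ":" and how the URL form replaces spaces with underscores.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the documented API: it only illustrates
// the convention used for the namespaces map.
class NamespaceTitles {

    // Joins a namespace prefix and an article title with ":". An empty prefix
    // (assumed here for the main namespace and for unknown ids) adds no separator.
    static String prefixedTitle(Map<Integer, String> namespaces, int namespaceId, String title) {
        String prefix = namespaces.getOrDefault(namespaceId, "");
        return prefix.isEmpty() ? title : prefix + ":" + title;
    }

    // Converts a prefixed title to the form used in MediaWiki URLs,
    // where spaces are written as underscores.
    static String urlForm(String prefixedTitle) {
        return prefixedTitle.replace(' ', '_');
    }

    public static void main(String[] args) {
        // Illustrative values: prefixes carry no trailing ":" and use spaces.
        Map<Integer, String> namespaces = new HashMap<>();
        namespaces.put(0, "");
        namespaces.put(120, "Property");
        namespaces.put(121, "Property talk");

        String title = prefixedTitle(namespaces, 121, "P31");
        System.out.println(title);          // prints "Property talk:P31"
        System.out.println(urlForm(title)); // prints "Property_talk:P31"
    }
}
```

Keeping the separator and the underscore conversion out of the map means the same prefix strings can serve both for display titles and for URL construction.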
-
processRevision
public void processRevision(MwRevision mwRevision)
Description copied from interface: MwRevisionProcessor
Process the given MediaWiki revision.
- Specified by:
processRevision in interface MwRevisionProcessor
- Parameters:
mwRevision - the revision to process
-
processItemRevision
-
processPropertyRevision
-
finishRevisionProcessing
public void finishRevisionProcessing()
Description copied from interface: MwRevisionProcessor
Performs final actions that should be done after all revisions in a batch of revisions have been processed. This is usually called after a whole dump file has been completely processed.
It is important to understand that this method might be called many times during one processing run. Its main purpose is to signal the completion of one file, not of the whole processing run. It is used only to manage the control flow of revision processing (e.g., to be sure that the most recent revision of a page has certainly been found). This method must not be used to do things that should happen at the very end of a run, such as writing a file with results.
- Specified by:
finishRevisionProcessing in interface MwRevisionProcessor
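The difference between the end of a batch and the end of a run can be sketched as follows. This is a minimal illustration under stated assumptions: RevisionBatchListener is a simplified stand-in interface, revisions are plain strings rather than MwRevision objects, and close() is a hypothetical end-of-run hook that the application would call itself. None of these names are part of the documented API.

```java
// Simplified, hypothetical stand-in for the batch lifecycle of a revision processor.
interface RevisionBatchListener {
    void processRevision(String revision);
    void finishRevisionProcessing();
}

class CountingProcessor implements RevisionBatchListener {
    private int revisionsInBatch = 0;
    private int totalRevisions = 0;
    private int finishedBatches = 0;

    @Override
    public void processRevision(String revision) {
        revisionsInBatch++;
    }

    // May be called once per dump file, so it only does per-batch bookkeeping
    // and never produces the final result.
    @Override
    public void finishRevisionProcessing() {
        totalRevisions += revisionsInBatch;
        revisionsInBatch = 0;
        finishedBatches++;
    }

    // Hypothetical end-of-run hook: the place for work that must happen
    // exactly once, such as writing a file with results.
    public String close() {
        return "batches=" + finishedBatches + ", revisions=" + totalRevisions;
    }

    public static void main(String[] args) {
        CountingProcessor processor = new CountingProcessor();

        // First dump file
        processor.processRevision("Q1");
        processor.processRevision("Q2");
        processor.finishRevisionProcessing();

        // Second dump file
        processor.processRevision("Q3");
        processor.finishRevisionProcessing();

        System.out.println(processor.close()); // prints "batches=2, revisions=3"
    }
}
```

Because the summary is produced in close() rather than in finishRevisionProcessing(), processing several dump files in one run still yields a single result.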
-