Scribd Unveils Text Matching System Beta for Copyright Holders

UPDATE: The Scribd Text Matching System is now a component of our Qualified Publisher Program. Click here for details.

The Scribd Text Matching System (TMS), nearly a year in development, is a first-of-its-kind semantic filtering system designed to help protect the copyrights and intellectual property of content providers. Scribd TMS builds a semantic "map" of each copyrighted text uploaded to our secure database, and then compares every upload to Scribd against those maps. If there's a significant match, the upload is denied, and the uploader is given instructions on how to contest a false positive. The TMS will become more precise as false positives are weeded out.
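For readers curious about what "map and compare" can look like in practice, here is a rough sketch. It is not Scribd's implementation, which isn't described in this post. It swaps in a much simpler lexical technique (hashed word shingles plus an overlap score) for the semantic matching the TMS performs, and every name in it (`shingles`, `TextMatcher`, `register`, `check_upload`, the `k` and `threshold` values) is made up for illustration.

```python
import hashlib
import re


def shingles(text, k=5):
    """Return the set of hashed k-word shingles for a text (a crude "map")."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return set()
    return {
        hashlib.sha1(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(len(words) - k + 1)
    }


class TextMatcher:
    """Hypothetical stand-in for a copyright database plus upload filter."""

    def __init__(self, k=5, threshold=0.5):
        self.k = k                  # shingle length in words (assumed value)
        self.threshold = threshold  # overlap that counts as "significant" (assumed)
        self.registered = {}        # title -> set of shingle hashes

    def register(self, title, text):
        """Store only the fingerprint of a protected text, not the text itself."""
        self.registered[title] = shingles(text, self.k)

    def check_upload(self, text):
        """Return (allowed, best_title, best_score) for a candidate upload."""
        upload = shingles(text, self.k)
        best_title, best_score = None, 0.0
        if upload:
            for title, fingerprint in self.registered.items():
                # Fraction of the upload's shingles found in the protected text.
                score = len(upload & fingerprint) / len(upload)
                if score > best_score:
                    best_title, best_score = title, score
        return best_score < self.threshold, best_title, best_score
```

In this sketch an upload is denied when at least half of its shingles appear in a single registered text. A real filter would need to tune that threshold against false positives, which is the same refinement described above.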

We are now accepting enrollments from authors and publishers. To enroll in the TMS, you must first create a normal Scribd account if you don't already have one.

  1. Log in to your Scribd account
  2. Point your browser to
  3. Complete all required fields of the sign-up form. Please include a basic description of the type of content you plan to upload and the approximate number of documents you plan to include.
  4. When finished, click Submit.
  5. A Scribd representative will send the official end-user agreement to the email address specified at sign-up. Complete and sign the end-user agreement. The agreement must be signed by the original copyright holder.
  6. Fax the signed agreement(s) to 650.745.0703 (we will also accept scanned PDFs by email to copyright at scribd dot com).

Once approved, you can head on over to the Copyright dashboard. From there, you can click the links to upload documents to the copyright database, or review the ones you've already uploaded. The information shown is very basic, and you will not be able to read the text of the documents as you normally would on Scribd. Filtering begins once a document has been analyzed and parsed for semantic matching data. Parsing time depends on the length of the document and the clarity and quality of its text, so filtering may not begin until several minutes to several hours after upload.

We've focused on functionality instead of appearance for the beta release, so the system is still pretty rough around the edges. We've completed a couple of weeks of testing based on content that had already been flagged for removal, and the success rate is remarkably high.

We welcome all publishers and authors to try this free service.