The Authorship Report: What's included in the report and how to interpret the results
This guidance is for the older version of the Authorship Report. To see the new Report guidance, select the Visit Guides link from within the Report.
Turnitin Authorship does not detect contract cheating but investigates the authorship of a file using forensic linguistics and natural language processing. Potential evidence and useful indicators of contract cheating are presented in the form of an Authorship Report.
None of the results in the report are empirical evidence that contract cheating has taken place, but together, they indicate whether the investigation file warrants further scrutiny by the appropriate body within your institution. This tool should be used as one step in a larger process to combat contract cheating.
This guide will break down each section of the report, help you interpret the results, and outline some scenarios that can produce false flags.
Note: This guide covers all user types. If you are a user tenant, some of the features and sections (such as paper download and Similarity) may not be available.
The Side Panel
Navigation
To the left of the report, there is a side panel. From here, you can navigate through the report.
Create an allegation brief
From the side panel, you have the ability to create an allegation brief. This is a tailored document containing a collection of all the sections you feel are relevant to your investigation and your investigation summary comments. It also only contains files that you have specified.
To add a section of the report to the allegation brief, select the pin icon in the top right of that section.
Once selected, the pin will turn blue and the section will be included in the allegation brief. Select the pin again to remove the section.
To open the allegation brief, select the Create a Brief option in the bottom left menu. The number of sections you have pinned to the brief will be shown next to the text.
The allegation brief options will appear in a panel on the right of the report.
You can remove sections by selecting the X next to them. You have the option to include the investigation comments from the review panel.
Choose which files you want to include in the brief by attaching signifying labels to them and then selecting those labels in the allegation brief panel.
When you are happy with what has been included in the allegation brief, select the Download brief button.
Review
Below the report sections, you will be able to access the review panel, which will appear on the right side of the report. From here, you can set an Investigation Status from the drop-down. The options are Escalate Investigation, Inconclusive, and Dismiss Investigation. The report will remember the status you have chosen, and it will be viewable by anyone you share the report with.
The status you set here will also be visible from within the My Reports area.
Below the Investigation Status, you can leave a summary comment to provide additional information/thoughts about the report. Again, this comment will be visible to anyone you share the report with.
Please note, when you share the report with other investigators and admins, they will be able to edit both the report status and the summary comment. The last person to edit the report, and the time and date they did so, will be visible at the bottom of the review panel.
Download the report
A PDF of the report can be printed and used in further escalation of an academic misconduct investigation.
Select the Download option in the side navigation to open your device's download manager. This may take a few moments.
The PDF of the report will contain each section and then a glossary at the end.
Help
The help section is available from the bottom of the side panel.
This section provides links to the community forum, the Authorship Investigate guidance, an opportunity to view the report tour that activates the first time you open the report, and a feedback option.
To hide the side navigation, select the arrows in the bottom right of the navigation.
The Investigation Recommendation
Authorship Investigate will attribute an investigation recommendation to each report. This will appear at the top of the report. Please note that this recommendation is based on the Turnitin authorship machine learning algorithm, which utilizes hundreds of linguistic features to predict whether a student's writing behavior warrants additional investigation. An 'Investigation Recommended' prediction is not proof of academic misconduct, nor does an 'Investigation not recommended' prediction exonerate a file. Whatever the recommendation, your intuition is important, and further scrutiny should be applied to any section if you feel it is necessary.
The recommendation engine is an evolving tool that utilizes machine learning. Your use of Authorship Investigate is actively contributing to the improvement and accuracy of the Authorship Report prediction.
File details
If you have used the submission ID upload method, then this section will include the class, assignment, filename, and date submitted of each paper. If you are an administrator or investigator, you will also have the ability to download all the files, or each file individually.
The File Details section is the first place in the report where you will be able to attach a signifying label to each paper. This will allow you to easily identify certain files and track them throughout report results.
To open the label assignment menu for a file, select the file symbol to the left of the filename in any of the tables in the report, or on any of the charts.
From the menu, you will be able to attribute one of four options to that file. These options are Suspicious, Follow-up, Dismiss, and No Label.
By default, the investigation file will always be tagged as Suspicious and the comparison files will be tagged as No Label.
When you attribute a label to a file, the label will be updated for that file throughout all sections of the report, regardless of where you set it.
Warning: If you have used the paper ID upload method, attributing a signifying label to a file will apply it to that file across all reports. If you have files that are being used in multiple investigations, please be aware that the labels can be changed from any report that the file features in.
Readability
Readability is calculated using one of the Flesch-Kincaid readability tests. These tests are designed to indicate how difficult an English passage is to understand. There are two tests: the Flesch Reading Ease and the Flesch–Kincaid Grade Level. Although they use the same core measures (word length and sentence length), they have different weighting factors.
Assuming the text is grammatically correct, the Authorship Report calculates readability using Flesch–Kincaid Grade Level and translates that into the number of years of education generally required to understand this text. It is important to note that this is not the grade level of the author. This formula can be used by anyone to calculate the grade level of a piece of writing. The Authorship Report will save you time by calculating the grade level for you.
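For reference, the published Flesch-Kincaid Grade Level formula combines average sentence length with average syllables per word: grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59. The sketch below is a minimal illustration of that formula only; the naive sentence splitting and syllable counting are simplifying assumptions for demonstration, not how the Authorship Report computes its score.

```python
import re

def estimate_syllables(word: str) -> int:
    # Crude heuristic: count vowel groups; adequate for illustration only.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(estimate_syllables(w) for w in words)
    # Published Flesch-Kincaid Grade Level formula.
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

print(round(flesch_kincaid_grade("The cat sat on the mat. It did not like the rain."), 1))
```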
The readability score is displayed as a time-series with years of education displayed on the y-axis and submission dates or file creation date along the x-axis. The investigation file is visualized in red by default, while the comparison files are visualized in gray by default.
By showing the files in order of submission or file creation you can more accurately see behavioral trends in the data.
If you have created your report using the paper ID method, the submission date will be used as the display point. If you have used file upload, the file creation date found in the document metadata will be used.
Warning: File creation date can only be found in .docx metadata. If you are using the file upload method and upload a file type other than .docx, this data will not be displayed.
If at any point you would like to view the data shown in a time-series as a table, select Show table.
This will display the tabular data from the time-series. Select Hide table to hide the table.
What am I looking for?
A student’s writing should improve over time, but look for files that have dramatically different scores. Contract cheated papers can be purchased at various levels. Cheap papers will often be poorly written and would return a low readability score. Alternatively, if a student wanted a better grade and purchased an expensive paper, the readability score could be much higher than their usual standard of writing.
Things to consider
An author’s readability score should improve over time. This should be taken into consideration if any of the comparison file(s) are older examples of an author’s work.
Variation in subject matter and/or assignment length can also lead to a swing in ‘Readability’ results.
Document Information
Document metadata is often used by investigators of academic misconduct as key evidence in their cases. The Authorship Report will gather this information quickly.
The most complete information can be retrieved from .docx files.
If the metadata cannot be found for a file, then that file will be grey and italicized.
Author Name
The author is the name given to the file creator. The results will be displayed in a table that will allow you to view the filename, author, and who the file was last modified by.
Filename and Author can be retrieved from .docx and .pdf files. Last modified by can be retrieved from .docx only.
The results can be ordered alphabetically (or reverse alphabetically) by selecting the column titles.
Dates
The report will collect dates that are pertinent to the investigation file and comparison files. The results will be displayed in a table that will allow you to see the filename, the date that the file was created, and the date the file was last modified. Dates can be retrieved from .docx and .pdf files.
The results can be ordered chronologically (or reverse chronologically) by selecting the column titles.
Page Sizes
Page size can reveal the origin of a document. There are three types of results:
- US Letter
- A4
- Other
US Letter is the paper standard in the United States, while A4 is standard in other countries. Files listed as Other either use a different page size than US Letter or A4, or use a combination of page sizes.
Page sizes can be retrieved from .doc, .docx, .pdf, and .rtf files.
Editing Time
This visual scale and table will display the total time spent editing the file. This is the amount of time spent with the document open and in front of other windows, whether the author is typing or not. This time accumulates and is saved each time the document is saved. Editing times can be retrieved from .docx files.
The results can be ordered alphabetically (or reverse alphabetically) by filename, and by length by selecting the column titles.
Revisions
This visual scale and table will display how many times the file has been revised (opened and changes made). Revisions can be retrieved from .docx files.
The results can be ordered alphabetically (or reverse alphabetically) by filename, and by the number of revisions by selecting the column titles.
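As background for the subsections above: a .docx file is a ZIP archive, and the author, dates, last-modified-by name, revision count, and total editing time are stored in two XML parts inside it (docProps/core.xml and docProps/app.xml). The sketch below shows one way to read those fields yourself; it is illustrative only, not how Authorship Investigate extracts them, and the filename is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}

def docx_metadata(path: str) -> dict:
    # A .docx is a ZIP archive; the metadata parts are plain XML files inside it.
    with zipfile.ZipFile(path) as z:
        core = ET.fromstring(z.read("docProps/core.xml"))
        app = ET.fromstring(z.read("docProps/app.xml"))

    def text(root, tag):
        el = root.find(tag, NS)
        return el.text if el is not None else None

    return {
        "author": text(core, "dc:creator"),
        "last_modified_by": text(core, "cp:lastModifiedBy"),
        "created": text(core, "dcterms:created"),
        "modified": text(core, "dcterms:modified"),
        "revisions": text(core, "cp:revision"),
        "editing_time_minutes": text(app, "ep:TotalTime"),  # may be 0 if the feature is disabled
        "application": text(app, "ep:Application"),
    }

print(docx_metadata("essay.docx"))  # hypothetical filename
```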
Font
Stylistic decisions such as font usage can often be revealing. Look for different fonts in the same paper, subtle differences in font size and color, and instances where one paper contains a different font to the rest.
Font data includes:
- the font families (for example, Times New Roman, Arial, etc.) used in a file
- the font sizes used in a file
- the colors used in a file
The report will show the percentage usage of each unique family, size, and color combination.
Fonts can be retrieved from .docx files.
Images
The report can pull in image data from .docx files. There are two views available. The Images Overview will show how many images were contained in each file.
If certain requirements are met, the report will be able to pull image paths from .docx files. This information will be available in the Full Image List.
Software and Operating System
This section pulls the software used to create the file, the software version, and, if possible, the Operating System (OS).
What am I looking for?
In all the results shown in the Document Information section, the main things to look for are inconsistencies. This section often contains the most incriminating evidence in the report.
Author Name
If a name belonging to anyone other than the supposed author is shown in these results, further investigation is recommended.
Note: In .docx files the Author Name is the name of the license holder. This could be a parent, peer, or institution.
Dates
A short turnover between the date created and the date last modified on a lengthier file should be noted, as it may indicate the copying and pasting of content. This section can also reveal something about the methodology of a student; do they generally create a file a few months or weeks before a due date? Does the data in this section match the Editing Time and Revisions data?
Page Sizes
If you are based in the UK or Australia, the standard paper size is A4. If you are based in the US, the standard paper size is US Letter. It would be unusual if a file was not the standard paper size for your region. It is worth investigating all paper size inconsistencies.
Editing Time
Short editing times on files are suspicious, as they can indicate that the bulk of the content has been copied into the file. Editing Time will also allow you to observe the methodology of the author. How long do they usually take to craft their work? Compare this data to Dates and Revisions to find out more about the author's methodology.
If the author has copied content from a document into a new document, or the author has used the ‘Save as…’ functionality to create a fresh copy of a document, the editing time would be reset to '0' in the new document.
Note: This feature may be disabled in some countries for privacy reasons; if this is the case, the value will always be shown as '0'.
Revisions
A low number of revisions is suspicious. Use this feature to examine a student's methodology; you should usually see multiple revisions for a file. If a file has had few revisions (fewer than five), it may indicate that the content has been copied and pasted in from another file, and it should be investigated further.
Font
Use this feature to cross-reference the stylistic habits of a student’s work. A sudden shift in font use should be examined further. For example, does a student use size 14 Arial font for four essays, and then size 12 Times New Roman for another? Why have they changed?
Font data can also be incredibly useful for uncovering other instances of academic misconduct.
A common malpractice is to hide content from markers by setting the font color to white. This would increase the word count, which would impact the file's similarity score as well as the various natural language processing features Authorship Investigate looks for. Any white text in a file should always be investigated further.
Subtle differences in font size and color can also indicate the copying and pasting of content. For example, a student may write an essay in black size 14 font. They may then copy a large portion of text from a different file that is dark grey size 13.5 and paste this content into their essay. The small difference in size and color may go unnoticed by the naked eye but will be apparent in the Font section of the Authorship Report.
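In a .docx file, an explicitly white font appears in word/document.xml as a run colour value of FFFFFF, so hidden white text of the kind described above can also be surfaced programmatically. The sketch below is one simple, illustrative approach (not Turnitin's method); it only catches runs whose colour is set explicitly, and the filename is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used throughout word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def white_text_runs(path: str) -> list:
    """Return the text of runs whose font colour is explicitly set to white."""
    with zipfile.ZipFile(path) as z:
        body = ET.fromstring(z.read("word/document.xml"))
    hits = []
    for run in body.iter(W + "r"):
        color = run.find(f"{W}rPr/{W}color")
        if color is not None and color.get(W + "val", "").upper() == "FFFFFF":
            hits.append("".join(t.text or "" for t in run.iter(W + "t")))
    return hits

print(white_text_runs("essay.docx"))  # hypothetical filename
```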
Images
Use this feature to compare the image conventions of an author. Does only one of the files submitted contain images? Or does one paper contain an excessive number of images compared to the others? This information should be used as a vehicle to further examine the papers in question.
If certain requirements are met, the report will be able to pull image paths from .docx files. This information will be available in the Full Image List. If this data is available, examine the image paths from each file for unusual or suspicious results. If the images have been added to the document from a computer other than the student's, this will be visible in the file path.
Compare these results with the rest of the document metadata to corroborate the results.
Software and Operating System
Changes in software, software version, and OS between documents would imply that the files have been authored on different computers. If the student claims they have only used one computer for all of their work, this data can be incriminating.
Things to consider
An author may use a blank file created by an instructor or peer as the basis for their document. Or they may have worked on the essay using a computer belonging to a friend, a peer, or the library. Another possibility is that they asked a peer to proofread and spell-check the document. All of these scenarios would lead to modifications by someone other than the author and a variance in the 'Author Name' section. When talking to the student, ask them to describe their writing process before presenting them with the document information discrepancies.
If the author has copied content from a document into a new document, or the author has used the ‘Save as…’ functionality to create a fresh copy of a document, the editing time would be reset to '0' in the new document, along with the dates. This could explain short results in ‘Editing Time’ and ‘Dates’.
Similarity
The Similarity section displays the similarity score of each file uploaded. This section is only available in reports that have been created using the paper ID upload method.
Note: The similarity score will be impacted by the settings you have set in Turnitin. Check these settings to make sure the similarity scores are accurate.
The time-series chart allows you to see the dates on which the file was first submitted along with the similarity score.
What am I looking for?
Look out for files with high similarity scores, as this may imply that the content of the paper has not been written by the student.
Also, look out for files with 0%, as essay mills/contract cheating companies will write essays with the aim to have a low similarity score. These papers should be investigated further to see if they have used sources and references.
Things to consider
Files with similarity scores above 80% imply that the content of that file has been plagiarized and therefore not written by the student. Consider discounting these files, as their results will not accurately reflect the writing of the student under investigation.
Language
Spelling
There are three views available for sorting spelling:
- Document View
- Word View
- Pattern View
The Document View is the view that was previously available. It shows all the classifiable words found in each document. These are words that have two potential spellings: British or American. If found, the report will display a visual spread of the spelling variations of these words throughout the files.
Select the drop-down arrow to the left of the filename to see a full list of the classified words found in that file. The report will display both American and British spelling and show the percentage of usage for each spelling type.
Note: Both spelling types are shown, even if the file only contains one type.
The Word View will display the classifiable words found throughout all submissions.
The Pattern View will display any classified word patterns found throughout all submissions. The report can identify the following word patterns:
- -ER vs. -RE
- -OR vs. -OUR
- -SE vs. -CE
- -IZE vs. -ISE
- Scientific American vs. Scientific British
- Single consonant vs. Double consonant
- No silent e/ue vs. Silent e/ue
- Other US vs. Other UK
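To illustrate the idea behind this classification, the sketch below tags a handful of classifiable words as American or British and reports the percentage usage of each region and pattern. The word list and pattern labels are made-up examples for demonstration, not Turnitin's actual dictionary.

```python
from collections import Counter

# Hypothetical sample of classifiable words: spelling -> (region, pattern).
VARIANTS = {
    "color": ("US", "-OR vs. -OUR"), "colour": ("UK", "-OR vs. -OUR"),
    "center": ("US", "-ER vs. -RE"), "centre": ("UK", "-ER vs. -RE"),
    "organize": ("US", "-IZE vs. -ISE"), "organise": ("UK", "-IZE vs. -ISE"),
}

def spelling_profile(text: str) -> dict:
    counts = Counter()
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'()")
        if word in VARIANTS:
            counts[VARIANTS[word]] += 1
    total = sum(counts.values()) or 1
    # Percentage usage of each (region, pattern) combination found in the text.
    return {key: round(100 * n / total, 1) for key, n in counts.items()}

print(spelling_profile("The colour of the centre square is a warm color."))
```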
Keywords
The Contract cheating keywords section identifies words and phrases that are known to have been found in contract cheated papers. Often a purchased paper will contain prompts to fill in certain details that a student would inadvertently leave in for submission. The report searches for a specific set of these words and phrases.
There are three views for keywords. The first is the Document View, which will show you how many keywords have been found in each file. Select the drop-down by each document to see which words have been found.
The Word View will show each of the keywords found throughout the files. Select the drop-down by a keyword to see which documents that keyword was found in.
The Detailed View will show a complete list of both the file name and the words found throughout all files.
Vocabulary Richness
Vocabulary richness (Hapax Legomena ratio) calculates the percentage of words in a document that only occur once.
For example, the following sentence contains 8 words with 4 words occurring only once, resulting in a vocabulary richness score of 50%: “The white cat sat on the white mat”
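The same calculation is easy to reproduce. The sketch below uses a simple tokenisation, which is a simplifying assumption for illustration rather than the report's exact behaviour; it matches the worked example above.

```python
import re
from collections import Counter

def vocabulary_richness(text: str) -> float:
    """Percentage of word tokens whose word occurs exactly once (hapax legomena ratio)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    hapax_tokens = sum(1 for w in words if counts[w] == 1)
    return 100 * hapax_tokens / len(words)

print(vocabulary_richness("The white cat sat on the white mat"))  # 50.0
```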
What am I looking for?
Spelling
Look for files that contain spelling variants different to the majority, and single files that contain both spelling variants.
Examine the files that contain variants to see if the discrepancies are the result of quotations.
Keywords
Experts at detecting purchased papers have told us that finding any of the contract cheating keywords in a file is often strong evidence in an academic misconduct case. If the Authorship Report finds any of the contract cheating keywords in the uploaded files, we recommend you view the file in question to see the context of the keywords. Does the file in which the keyword(s) appear have a different look and feel to the rest of the submitted files?
It is worth noting that essay templates may also contain these keywords.
Vocabulary Richness
It is important to remember the different levels of writing that can be purchased when considering Vocabulary Richness results. Look for files that have dramatically different results and then examine the files further. Files with shorter word counts or drastically different subject matter can create outlier scores.
Things to consider
If a student has used spell-checking software, they could inadvertently use a spelling variant other than their expected one. And if a student has taken a blank assignment template, they could have left in some prompts by mistake, which would appear as contract cheating keywords.
When discussing the assignment with the student, find out from the outset whether spell-checking software or a template has been used.
Sentences
How an individual crafts their sentences is often a stylistic feature that is common in all of their writing. The results for each of the sentence features will differ from file to file, but files by the same author should have relatively similar results.
Sentence type
The report will show how each paper has utilized the different sentence structure types. There are four main sentence structures: simple, compound, complex, and compound-complex.
- A simple sentence contains one independent clause; for example, “I like cats.”
- A compound sentence contains two or more independent clauses; for example, “I like cats, but my friend likes dogs.”
- A complex sentence contains one independent clause and one or more dependent clauses; for example, “The cat ran inside because it was raining.”
- A compound-complex sentence contains two or more independent clauses and one or more dependent clauses; for example, “The cat ran inside because it was raining, and it hates getting wet.”
If a sentence does not fall into one of these types, the report will list it as "other"; examples include bullet points, tables, and image captions.
This offers a visual representation of how the student has used sentence type across each of their submissions. It will give you an idea of the complexity of their writing.
Average sentence length
Average sentence length is the average number of words per sentence in a document.
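As a rough illustration (the naive sentence splitting below is a simplifying assumption, not the report's exact behaviour):

```python
import re

def average_sentence_length(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    word_counts = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    return sum(word_counts) / len(sentences)

# "I like cats." has 3 words; "My friend likes dogs, but not cats." has 7 words.
print(average_sentence_length("I like cats. My friend likes dogs, but not cats."))  # 5.0
```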
Phrases per sentence
Phrases per sentence is the average number of phrases per sentence in a document. Phrases are sets of words that form a single grammatical piece of a sentence.
There are several types of phrases: noun phrase, verb phrase, adjective phrase, prepositional phrase, adverb phrase, etc. The phrases per sentence score is calculated using top-level phrases; that is, phrases that are not nested inside another phrase.
For example, the sentence "The cat sat on the mat" has two top-level phrases: a noun phrase (The cat) and a verb phrase (sat on the mat). Therefore, in a document containing 100 sentences split up into 200 total phrases, the phrases per sentence would be 2.0.
What am I looking for?
It is important to remember the different levels of writing that can be purchased when considering all the results in the Sentences section.
Sentence type
How we use sentence types is based on our writing level and is habitual. The visualizations in this section should all have roughly the same shape. If any of the visualizations look different from the others, that file should be investigated further.
Average sentence length
Look for patterns in how long the author makes their sentences. Do they use short succinct sentences, or long, drawn-out ones? Does the file under investigation conform to this habit? If not, the file should be investigated further.
Phrases per sentence
The number of phrases we use in a sentence is habitual and usually constrained by our writing level. Look for outlying results in this section. The phrases per sentence should be similar throughout work written by the same individual. Any outliers should be investigated further.
Things to consider
An author's results in the Sentences section should improve over time. This should be taken into consideration if any of the comparison file(s) are older examples of an author's work.
Variation in subject matter and/or assignment length can also lead to a swing in ‘Sentence’ results.
References
The Authorship Report identifies the style used for in-text citations. A citation style, sometimes called a reference style, is a set of standards on how to refer to your sources in academic writing. The Authorship Report can identify the following citation styles: APA, MLA, and Turabian. Any other citation style detected will be displayed as Other.
The citation style shown next to the filename indicates the style used most frequently in the file.
Select the dropdown arrow to see the full list of styles found and the buffer text of the citation.
What am I looking for?
Different academic disciplines often follow their own referencing standards, but universities and colleges may also request all disciplines in the institution to follow one style. Look for variances in citation styles, but also take time to examine the paper to see how each citation is presented. Paid essays may often be more meticulous in their references than an inexperienced student.
Things to consider
A student may have been instructed by a professor, or recommended by a peer, to use a certain style guide on different assignments. Alternatively, the student could have a poor understanding of referencing, and the discrepancies may simply mean they need to be better educated on references and how to use them correctly.
Viewing the files
Throughout the report, selecting the filename of a file will open it in a document viewer. Viewing the actual text of the files can often be an enlightening process, and we recommend referring to the files throughout your investigation.
To aid this process, the document viewer has a side panel that will contain aspects of the report so you can corroborate the results of the report with the file text.
The side panel contains three sections:
- Document Information. This section contains document metadata, page size, and font data.
- Images. This section contains the number of images contained in the file and, if they can be found, the image file paths.
- Language. This section contains spelling data and contract cheating keywords.
Paper navigation in the viewer
Investigators can now navigate between files in the viewer. Open a file by selecting the file name anywhere in the report. This will launch the file in the viewer.
To navigate between papers, select the arrows next to ‘Submission x of x’.
Look for differences in layout and formatting between files.