1. Collect a set of texts that you will compare in several ways.
2. You will likely want to treat them as "bags of words." Prepare the text by converting it to lower case and removing extra white space and punctuation, and consider stemming the words, if appropriate for your purpose.
3. Generate relative word frequencies for each bag of words, and compare them to each other.
4. Articulate what differences (if any) you notice and whether they comport with a theory of why these bags of words should be similar or different.
5. Run statistical tests of association between the bags of words (correlation, cosine similarity, regression, or chi-squared), and explain what the results indicate.
6. Do one more big thing. Choose one of the following: run a sentiment analysis of the bags of words; rerun your analysis using bigrams and/or trigrams; consider the role of negation words ("not," "no," etc.) in your earlier analysis; run a part-of-speech tagger; look at the temporal unfolding of your words; or do a topic modelling exercise. For whichever you choose, explain what you are doing and whether what you find makes sense theoretically.
7. Generate word clouds of your texts.
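
As a rough illustration of steps 2, 3, and 5 (plus the bigram option in step 6), here is a minimal sketch in Python using only the standard library. The function names and the two sample sentences are hypothetical placeholders, not part of the assignment, and stemming is left out. Note one assumption in the cleaning step: punctuation is replaced with a space, so a contraction such as "don't" is split into "don" and "t"; you may prefer a different rule for your texts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text, replace punctuation with spaces, split on white space."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)  # assumption: punctuation becomes a space
    return text.split()


def bag_of_words(text):
    """Return a Counter mapping each word to its raw count."""
    return Counter(tokenize(text))


def relative_frequencies(bag):
    """Convert raw counts to proportions of all tokens in the bag."""
    total = sum(bag.values())
    return {word: count / total for word, count in bag.items()}


def cosine_similarity(freq_a, freq_b):
    """Cosine of the angle between two word-frequency vectors (0.0 if either is empty)."""
    shared = set(freq_a) & set(freq_b)  # only shared words contribute to the dot product
    dot = sum(freq_a[w] * freq_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def bigrams(text):
    """Return a Counter of adjacent word pairs, for rerunning the analysis on bigrams."""
    tokens = tokenize(text)
    return Counter(zip(tokens, tokens[1:]))


# Hypothetical sample texts; substitute the texts you collected in step 1.
text_a = "The cat sat on the mat. The cat slept."
text_b = "The dog sat on the log; the dog barked!"

freq_a = relative_frequencies(bag_of_words(text_a))
freq_b = relative_frequencies(bag_of_words(text_b))

print("Most frequent words in A:", bag_of_words(text_a).most_common(3))
print("Cosine similarity:", round(cosine_similarity(freq_a, freq_b), 3))
print("Most frequent bigrams in A:", bigrams(text_a).most_common(2))
```

Cosine similarity here runs from 0 (no shared vocabulary) to 1 (identical relative frequencies). With short texts, very common words such as "the" tend to dominate the score, which is worth keeping in mind when you interpret it in step 4.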