How to convert a document from Google Docs to a text file

How can I capture all the text in a Google Docs document and convert it to a text file, preferably in a way that can be used in a script? Would wget work, for example:

wget > googledoc.txt

If so, would I be able to use a shortened URL?

2 Answers

There is no need to pipe the output to another program to convert the file. You can download a document from Google Docs in any supported format by using the existing parameters in the URL.

where:

  • FILE_ID is the ID string of the target file, and
  • FORMAT is the file format of your choice, e.g. txt

Downloading the document from Google Docs as a text file is then straightforward with either wget or a web browser. Both methods save the document as plain text, as expected.
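As a minimal sketch of the scripted download described above, the helper below builds an export URL from FILE_ID and FORMAT and passes it to wget. The URL pattern is an assumption, inferred from the docs.google.com host and the export?format=txt file name that appear in the wget output in this answer. The function names gdoc_export_url and gdoc_download are made up for illustration. The sketch also assumes the document is shared so that it can be read without signing in.

```shell
#!/bin/sh
# Assumed URL pattern; it is not spelled out in the answer text.
gdoc_export_url() {
    # $1 = FILE_ID, $2 = FORMAT (defaults to txt)
    printf 'https://docs.google.com/document/d/%s/export?format=%s\n' "$1" "${2:-txt}"
}

# Hypothetical wrapper: download a document to a named output file.
gdoc_download() {
    # $1 = FILE_ID, $2 = output file, $3 = FORMAT (optional, defaults to txt)
    wget -q -O "$2" "$(gdoc_export_url "$1" "${3:-txt}")"
}
```

For example, gdoc_download FILE_ID googledoc.txt would write the text to googledoc.txt. Passing -O to wget avoids ending up with a file named export?format=txt.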

I tried it myself, and the output looks something like this:

$ wget
--####-##-## ##:##:##--
Resolving docs.google.com (docs.google.com)...
Connecting to docs.google.com (docs.google.com)... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘export?format=txt’
[ <=> ] 649 --.-K/s in 0s
####-##-## ##:##:## (##.# MB/s) - ‘export?format=txt’ saved [649]

The URLs for other products, such as Google Sheets, Google Slides, or Google Drive, are slightly different.

As for documentation, the only relevant guide I found was this dated blog post from around 2014. There is also this page of the Google Drive developer guide, but it is not useful as it stands.

Download the Google Doc as a Word document with the .docx file extension. Make sure the docx2txt package is installed, then run the docx2txt command followed by the name of your file. For example:

docx2txt report.docx
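In a script, it may help to wrap this step with a couple of checks. The sketch below only adds guards around the docx2txt call shown above. The function name convert_docx is hypothetical, and where the converted text ends up (a .txt file or standard output) depends on which docx2txt tool is installed, so the sketch makes no assumption about it.

```shell
#!/bin/sh
# Hypothetical wrapper around the docx2txt command from the answer.
convert_docx() {
    # $1 = path to the .docx file
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 2
    fi
    if ! command -v docx2txt >/dev/null 2>&1; then
        echo "docx2txt not found; install it first" >&2
        return 1
    fi
    docx2txt "$1"
}
```

Calling convert_docx report.docx then behaves like the plain command, but stops with a clear message if the file or the tool is missing.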
