Is there a WORKING way to rip from Scribd? In short: yes...and no.
Here's how *I* have been able to do it, so far...
And it appears to be working fine, AFAIK anyway. Tested on an Android phone; I'll test on an Android tablet later. As always, your device/mileage may vary lol.
This WILL require the installation of ONE .apk, a file viewer, which I will return to in a minute.
First, what you're going to want to do--if you have reactivated your subscription to Scribd, that is (if not, hit up some poor sod like me that's paying lol!)--is go to the page of the book you want to read/keep. Say, here:
https://www.scribd.com/read/202741013/Wanderling-s-Choice#
THREE THINGS:
ONE: you will PROBABLY, in fact MOST LIKELY (not 100% certain but I already have it so the point is moot on my end) need to have the Scribd app itself in case you want to hit the "real" Download button
TWO: MAKE SURE the page link says "read" NOT "book" UNLESS YOU HAVE NOT ALREADY DOWNLOADED IT OFFLINE YET
THREE: This method will work ONLY with the program/s specified, exactly as linked.
IF you are not on Android, I cannot help you lol!
Now, once you've gone to the "read" page, it SHOULD pull up the file you are attempting to--here's the surprise!--read.
Go into your browser's menu (in my Chrome for Android it's the three dots in the upper-right-hand corner; again, ymmv).
HERE'S THE KEY: PRESS THE "DOWNLOAD" BUTTON (on Android it's the underlined arrow).
With luck, the page SHOULD say "downloading".
NOW, once that file has been downloaded--with any luck and all good fortune it should work that way!--
Go into your "Downloads" folder on your device and you should see a file labeled (thisismybooktitle).MHTML
We want this. This is good. DO NOT ATTEMPT TO RELOCATE THE FILE YET, OR POSSIBLY EVER (lol).
Seriously, though: the first time I moved the .mhtml into a dedicated folder and tried to open it, the viewer told me it didn't exist. If it does that, redownload it and move it a second time. Technology is dumb.
Install THIS VIEWER:
https://www.mobileaction.co/app/android ... htmlviewer
I have tried AT LEAST three others (Adobe Acrobat, Moon+ Reader Pro and another mht viewer), NONE of which did the thing.
Now, open the viewer and open the file IN the viewer. It might take a couple of mashes to get it to open; use the "triple-stack" icon in the top-left corner. Rest assured, the two files I have tried so far have worked fine!
Read offline, profit!
If it should be necessary, download and run the Scribd app first to get the data the initial .mhtml is based on.
OR, if anybody knows a way to WRANGLE that .mhtml data into some kind of "preservable" ebook file (i.e. .epub, .mobi, .pdf), SEND HELP, PLZ!
https://products.groupdocs.app/conversion/mht-to-pdf
Does this work? It converts MHTML to PDF.
https://www.quora.com/How-do-you-download-a-book-from-Scribd?share=1
Another solution on Quora, don't know if it works though.
https://userscripts-mirror.org/scripts/show/160374 Userscript for scribd download. Use it through Greasemonkey.
https://www.reddit.com/r/Piracy/comments/anavw4/methods_for_downloading_book_for_scribd/?utm_source=amp&utm_medium=&utm_content=comments_view_all
This Reddit thread also mentions a few methods.
Space001, I have tried a thousand times to install that >*REDACTED!*< greasemonkey script but can't find the way to do it... >.< If there is any way you could help us download by using it, that would be greatly appreciated!
I have *attempted* to use that mht converter, HOWEVER, it ONLY wants files with the MHT extension, NOT MHTML. I will copy one and rename it to try one out for kicks but I am not certain I hold out a lot of hope...
Nope, the mht program says file is corrupted.
Any help?
Has anyone found a way to rip books from Scribd? So far I have been unsuccessful in all my attempts.
I tried your method, AquilaLorelei, but it only downloads the first page as an .mhtml file, and then when you click it to open it or change pages it takes you back to the Scribd app. The full book is contained in the .json files, but I have no idea what to do with those.
@thevoiceofreason: okay, let me see if I can simplify this any, because the .mhtml file that downloaded from Scribd pulled up fine for me in the secondary .mhtml reader program, found HERE:
https://apps.alldbx.cn/apps/5da02aa14a4629cdd23dbd2b
Cut and paste this link into your browser:
https://www.scribd.com/read/239948642/G ... d-Betrayal
As an example, say.
Hit the "Download" button; it SHOULD come out as "Greek Myths[...].mhtml".
Open the .mhtml reader. Open the .mhtml file IN the .mhtml reader by clicking the grey prompt bar that says "Select a File to Open." WARNING: it MAY show up as a "frame" initially. It may also give you a message that says "We've moved you to where you were reading on your.. " If so, close that. Also, if given a "frame," click on the "triple-stack" icon in the upper-left-hand corner. It SHOULD load the file properly, though TBH it MAY take a couple of "button-mashes," perhaps involving the "Back" button, but TRUST ME. It will only APPEAR to kick you back to Scribd, BUT if you minimize out of the program it SHOULD indicate you're in the MHT reader. Let me know if that works.
@Aquilalorelei well, I researched that userscript and the other methods I posted, but none of them are supposed to work on Scribd books. I didn't read things properly before posting.
The only solution I have found that is supposed to work on books is a Python script that was on GitHub but was later taken down.
https://libraries.io/github/ritiek/scribd-downloader is an archive of the project, but I don't have a Scribd premium account to test it. (The description says it works on books.)
Also, if you search Google for 'json to text converter', a number of links pop up. Can you try using them to rip text from the book?
I found the mhtml converter through a Google search. There were other sites too; I posted one that worked on an mhtml from another site. (.mht is just the short extension for mhtml.)
Maybe scribd does something to make them not work.
P.S. If you don't mind, can you upload a Scribd json or mhtml somewhere? I don't have a premium Scribd account.
I'll try and see if I can get something from it.
Space, try this, it's a ripped .mhtml file directly from Scribd. Enjoy, HOPE it works cuz I'm looking forward to some SERIOUS book rippage in future if possible lol!
https://www19.zippyshare.com/v/HRQ0pXty/file.html
Cheers!
https://www.reddit.com/r/Piracy/comments/gbabde/progress_on_scribd_book_ripping/
New way to deal with .json files from Scribd. Copying the steps here (credit to u/hurltossthrowaway from Reddit):
Prerequisites:
- Paid Scribd account
- Microsoft Excel
- Microsoft Word
Phase I : Extract
1. Start by logging into your Scribd account and downloading a book to your mobile device. As the named feature implies, you cannot do this from a macOS or Windows based laptop.
On iOS, the files are stored in /var/mobile/Containers/Data/Application/Scribd/Library/Application Support/documents. This folder will contain one or more folders with a numerical notation which corresponds to the document ID referenced in the Scribd URL (i.e. https://www.scribd.com/book/[documentID]).
2. Send the folder contents to a proper computer (zip/email/unzip).
3. Start a new document in Microsoft Excel and select Data > Get Data > From File > From JSON.
4. Navigate to the folder you transferred, then to the chapters folder (there appears to be one chapter folder for every item in the Scribd book's table of contents). Select the chapter folder's contents.json file and click Open.
About the contents.json structure: it has a lot of columns and rows, but very few of them are relevant to extracting text. As you open these structures, be on the lookout for one of these three types: words, src, and cells. Each of these contains substructures with content you want. In each case, you just keep drilling down in the JSON tree until there are no structures left and you arrive at the actual content.
The 'words' structure contains a 'text' substructure. If there are no further structures, this likely contains text. Otherwise you will find another 'words' substructure with text.
The 'src' structure has no substructure and contains references to images stored in the file structure which you will want to rebuild the document in all its fidelity.
The 'cells' structure is the most complex, but it's still really just a bunch of nested structures which you must click through to get to the actual text content. The structure looks a little something like this: - cells - nodes - words - text - words
5. With the contents.json file opened, you should be looking at a two-row structure, the first row named 'blocks' and the second named 'title'.
6. Click on the word 'List' in the 'blocks' row, which takes you to a new screen. Click the Convert to Table button in the top-left of your screen. A prompt appears with two questions - accept the defaults and click OK.
7. A table should appear with one column (named 'Column1') and a few record rows.
Click the small Expand icon to the right of the 'Column1' name. A dropdown list will appear. Be sure to click 'Load More' to ensure all columns appear.
Deselect 'All Columns' and select any checkbox named either src, cells, or text.
Click OK.
8. Depending on the choices you made, you will get a table with one or more columns.
Some columns already appear to have actual text from the book in them. Other columns may appear to have text references to images in them. Other columns may appear blank except for the same Extract icon you saw earlier. These columns will need to be expanded further until you get additional text.
9. Once you've expanded as many src, text, or cells columns in this JSON file as you need, click the Close & Load button in the top left. This will load the data into Excel in a tabular format.
Phase II: Transform
10. At this stage, you should have anywhere from one to four columns of data in Excel, with about 1-2 words per cell.
This is the point you want to be moving the content to Microsoft Word to clean up the content with a bunch of simple search and replace operations.
These are the types of things you want to be looking for:
- remove all tab characters (^t) by replacing them with nothing
- find all double paragraph entries (^p^p) and replace them with a unique placeholder like "trumpnuts"
- find all single paragraph entries (^p) and replace them with a single space (" ")
- find all unique placeholders you created (like "trumpnuts") and replace them with a single paragraph entry (^p)
After you perform these quick search and replace operations, do a quick scan for any other cleanup activities you can fix. You can automate all the search and replace activities by creating a macro and saving the Word document as a template.
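For anyone without Word, the same four-step placeholder swap can be sketched in plain Python. This is only an illustration of the search-and-replace order described above, applied to an ordinary string; the function name `unwrap_paragraphs` and the `PLACEHOLDER` value are made up for this example, and it assumes a paragraph mark corresponds to a newline character.

```python
# Hypothetical marker: any string that never occurs in the text will do.
PLACEHOLDER = "\u0000PARA\u0000"


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines while keeping real paragraph breaks."""
    # 1. Remove all tab characters.
    text = text.replace("\t", "")
    # 2. Protect double paragraph marks with a unique placeholder.
    text = text.replace("\n\n", PLACEHOLDER)
    # 3. Turn every remaining single paragraph mark into a space.
    text = text.replace("\n", " ")
    # 4. Restore each placeholder as a single paragraph mark.
    return text.replace(PLACEHOLDER, "\n")


if __name__ == "__main__":
    sample = "one\ntwo\n\nthree\tfour"
    print(repr(unwrap_paragraphs(sample)))
```

The order matters: if step 3 ran before step 2, the real paragraph breaks would be flattened into spaces along with the unwanted line breaks.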
11. Save the chapter file as a .docx file. It's not a bad idea to name the document after the chapter from which it was derived.
Phase III: Load
Repeat Phases I and II for every chapter in the book.
12. When you're done, open the first chapter you converted in Microsoft Word. At the end of the document, choose Insert > Object > Text From File... and select all the other chapter documents you converted.
At this point you should have an entire book's worth of content. You may want to style the document using block styles and headings. You can generate a table of contents as well.
When you're done, import the docx file into Calibre, where you can pull in metadata, document covers, and convert to epub, PDF, azw, or whatever you want.
Is there a way to do that with OpenOffice software instead of Microsoft? It GALLS me that something so theoretically simple should be beyond my grasp because I am using OpenOffice Calc and Writer instead of Microsoft Excel and Word. Cheers for the help!
Is there any way someone could download the books for me if I send the proper links?