Convert PDF to MP3 in Linux

Date: 18 Dec 2010

http://ubuntuforums.org/showthread.php?t=1364786

The Python script from the link is reproduced below, with the usage instructions embedded in its comments.

#!/usr/bin/python
# ###################################################
# pdf2mp3.py - little script/program to convert a
# pdf-file or ascii-file (.dat, .txt) into an mp3 audio file
#
# Copyright (C) 2010 Hannes Rennau
# kontakt-hannes@hannijanni.de
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the
# Free Software Foundation, Inc.,
# 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
# ###################################################

# LIST OF PACKAGES NEEDED:
# you need to install the following packages:
# sudo apt-get install python poppler-utils festival festvox-rablpc16k lame

# HOW TO USE:
# 1. create a file with the name pdf2mp3.py and copy the
# whole text into it
# 2. make the file executable via:
# >>> chmod +x pdf2mp3.py
# 3. copy the file to /usr/bin so the program can be called from anywhere on your computer:
# >>> sudo cp pdf2mp3.py /usr/bin/
# 4. after that just call the script via:
# >>> pdf2mp3.py yourfilename.pdf
# or:
# >>> pdf2mp3.py yourfilename.dat
# or:
# >>> pdf2mp3.py yourfilename.txt
#
# and a file called yourfilename.mp3 is created

# ADDITIONAL INFO:
# in case you have lots of tables, equations or whatever else in your pdf,
# it's a good idea to delete all the strange symbol characters in
# the yourfilename.txt file and then run the script again, calling it via:
# pdf2mp3.py yourfilename.txt

import os
import re
import subprocess
import sys

# get filename
if len(sys.argv) != 2:
    sys.exit('usage: pdf2mp3.py yourfilename.pdf (or .dat, .txt)')
filename_inp = sys.argv[1]
filename, extension = os.path.splitext(filename_inp)

if extension in ('.pdf', '.dat', '.txt'):
    if os.path.isfile(filename_inp):
        # .dat and .txt files are read directly; a pdf is converted first
        text_file = filename_inp
        if extension == '.pdf':
            print('converting pdf to ascii...')
            text_file = filename + '.txt'
            subprocess.call(['pdftotext', filename_inp, text_file])
        print('start converting ascii to wav...')
        # keep only letters, spaces and . , ! ? (same filter as the
        # original sed call; newlines are kept, as sed works per line)
        with open(text_file, 'rb') as text:
            cleaned = re.sub(br'[^a-zA-Z .,!?\n]', b'', text.read())
        # text2wave reads the text to speak from stdin
        text2wave = subprocess.Popen(['text2wave', '-o', filename + '.wav'],
                                     stdin=subprocess.PIPE)
        text2wave.communicate(cleaned)
        print('converting to mp3...')
        subprocess.call(['lame', '-f', filename + '.wav', filename + '.mp3'])
        if os.path.exists(filename + '.wav'):
            os.remove(filename + '.wav')
        print("finished. don't forget to delete txt file if you don't need it")
    else:
        print('*** file does not exist ***')
else:
    print('*** please give extension of your file (.dat, .txt or .pdf)')
    print('*** or your file called: ' + filename_inp + '\n*** does not exist')

# that's it. have fun!!! :)
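
The script filters the text with the sed expression s/[^a-zA-Z .,!?]//g before handing it to text2wave. To preview what festival will actually read, the same filter can be sketched as a small standalone Python function. This is only an illustration: `clean_for_speech` is a hypothetical helper name and is not part of the original script.

```python
import re

# Hypothetical helper (not in the original script). It mirrors the sed
# expression s/[^a-zA-Z .,!?]//g. sed works line by line, so newlines
# survive the filter; the pattern therefore keeps "\n" as well.
_DISALLOWED = re.compile(r'[^a-zA-Z .,!?\n]')


def clean_for_speech(text):
    """Drop every character that the script's sed filter would drop."""
    return _DISALLOWED.sub('', text)


if __name__ == '__main__':
    sample = 'E = mc^2 (see Table 3), right?\nYes!'
    print(clean_for_speech(sample))
```

Note that the filter removes digits along with symbols, so numbers in the text are not spoken. If that matters for your document, editing the .txt file by hand as described in the ADDITIONAL INFO comment is probably the safer route.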
