To be or not to be

August 21, 2008

How to convert PDF to Text in Unix? – Use xpdf’s pdftotext

Filed under: How To, Shell Script - tdas @ 7:01 pm

Most Unix distributions do not ship with a built-in utility to convert PDF documents to text. Thanks to xpdf, we can use its pdftotext command for this kind of task. Below are the steps, from installing pdftotext on your machine to using it for PDF to text conversion.

  1. Download the source code from ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.02.tar.gz ( wget will do )
  2. Untar and uncompress the archive : tar -xzf xpdf-3.02.tar.gz
  3. Go into the directory xpdf-3.02/
  4. Run ./configure ( installs to the standard path by default )
  5. make ( you will need gcc with C++ support, i.e. g++ )
  6. make install ( root privilege needed )
  7. At this point you should have successfully installed the xpdf utilities :)
  8. Now try converting a PDF document to text : pdftotext foo.pdf ( with no output name given, this writes foo.txt )
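
The conversion step can be wrapped in a small shell function once pdftotext is installed. This is only a sketch: pdf_to_text is a made-up helper name, and the -layout option (which asks pdftotext to preserve the physical page layout) is my choice, not something the steps above require.

```shell
#!/bin/sh
# pdf_to_text is a hypothetical helper, not part of xpdf.
# It converts one PDF and writes the text file next to the input.
pdf_to_text() {
    pdf="$1"
    if [ ! -f "$pdf" ]; then
        echo "no such file: $pdf" >&2
        return 1
    fi
    if ! command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext not found; install xpdf first" >&2
        return 127
    fi
    # -layout keeps the original physical layout as far as possible;
    # ${pdf%.pdf}.txt turns foo.pdf into foo.txt
    pdftotext -layout "$pdf" "${pdf%.pdf}.txt"
}
```

Calling pdf_to_text foo.pdf would then leave the text in foo.txt, and the function returns a non-zero status if the file or the tool is missing.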

Hopefully that's helpful for someone :)

Cheers

May 30, 2008

How to convert lower-case to upper-case in C++ ?

Filed under: How To - tdas @ 1:06 pm

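The standard C library only converts one character at a time (tolower and toupper in <ctype.h>); it has no built-in function to convert a whole string from lower case to upper case or vice versa.
The C++ STL has an elegant and simple solution for this: the transform function.

Note: include the <algorithm> header (for transform) and <cctype> (for tolower and toupper) in your C++ file.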

To convert Upper Case -> Lower Case:
transform( upperLine.begin(),upperLine.end(), upperLine.begin(),(int(*)(int)) tolower );

To convert Lower Case -> UpperCase:
transform( lowerLine.begin(),lowerLine.end(), lowerLine.begin(),(int(*)(int)) toupper );

Cheers

May 26, 2008

How to use the OR operator in Grep?

Filed under: How To, Shell Script - tdas @ 12:44 pm

Ever wondered how to search for multiple patterns in one grep statement? A problem I often run into while using grep is wanting something like "find all lines that start with a $ or with a #" and getting stuck.

Well, apparently using the OR operator in grep is trivial.

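grep '^\$\|^#' foo.dat ( this will return all lines that start with a $ or with a # )

Note: Do NOT forget the backslash \ before the |. The $ needs a backslash too, otherwise grep reads ^$ as "empty line" instead of a literal dollar sign. Single quotes keep the shell from touching either character.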
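
Here is a small sketch that tries the OR search on a throwaway file, in three equivalent spellings. The file name sample.dat and its contents are made up for the demo, and the -E and -e forms are alternatives I am adding, not part of the command above. The literal $ is escaped in every pattern because GNU grep treats an unescaped $ in front of \| as an end-of-line anchor.

```shell
#!/bin/sh
# Build a small sample file (sample.dat is a made-up name for this demo).
tmpdir=$(mktemp -d)
printf '%s\n' '$price' '#comment' 'plain text' '' > "$tmpdir/sample.dat"

# 1. Basic regex: the OR operator is written \| (a GNU grep extension).
bre=$(grep '^\$\|^#' "$tmpdir/sample.dat")

# 2. Extended regex: | needs no backslash, the literal $ still does.
ere=$(grep -E '^\$|^#' "$tmpdir/sample.dat")

# 3. Several -e patterns: a line matches if any one of them matches.
multi=$(grep -e '^\$' -e '^#' "$tmpdir/sample.dat")

printf '%s\n' "$multi"
rm -rf "$tmpdir"
```

All three should print only the $price and #comment lines; the plain text line and the empty line are skipped.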

Hope that will help someone :)

May 21, 2008

How to tar and untar files in UNIX?

Filed under: How To, Shell Script - tdas @ 3:14 am

A few years ago, during my undergraduate degree, I was asked to compress my assignment using tar and submit it. I was so scared of the whole tar+compress+unix jig that I ended up NOT submitting the assignment :O. Now when I look back, I feel so stupid. Anyway, now that I know a little more about tar and untar, I'd like to share my knowledge with everyone and hopefully save someone from NOT submitting an assignment :P

Basically, tar can be used to group multiple files/directories into one single file, and to separate (extract) an archive created by tar back into individual files.

* To group multiple files : tar -cvf foo.tar a.dat b.dat c.dat ( this will group the files a.dat, b.dat and c.dat into one file, foo.tar )
c = create a tar file
v = verbose ( lists each file as it is processed; optional )
f = use the archive file named in the next argument

That's all you need to know to tar (group) a bunch of files/directories.

* To tar files and gzip them : tar -czf foo.tar.gz *.dat ( this will create a gzip-compressed tar file named foo.tar.gz containing all files with a .dat suffix in the current directory; z = compress with gzip )

* To untar (separate) files from a tar archive : tar -xvf foo.tar ( for the archive created above, this will produce three separate files a.dat, b.dat and c.dat; x = extract )

* To untar(extract) a gzipped tar archive file : tar -xzf foo.tar.gz

* To untar a bzipped (.bz2) tar archive file : tar -xjf foo.tar.bz2
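
To see the commands above work end to end, here is a minimal round trip in a scratch directory. It is a sketch with made-up file contents, and it uses two options not covered above: -t, which lists an archive without extracting it, and -C, which makes tar change into a directory first.

```shell
#!/bin/sh
# Round trip: create, list and extract a gzip-compressed archive.
# The scratch directory and file contents are made up for the demo.
work=$(mktemp -d)
printf 'alpha\n' > "$work/a.dat"
printf 'beta\n'  > "$work/b.dat"
printf 'gamma\n' > "$work/c.dat"

# c = create, z = gzip, f = archive name; -C switches into $work first
tar -czf "$work/foo.tar.gz" -C "$work" a.dat b.dat c.dat

# t = list the archive contents without extracting anything
listing=$(tar -tzf "$work/foo.tar.gz")

# x = extract, here into a separate directory so nothing is overwritten
mkdir "$work/out"
tar -xzf "$work/foo.tar.gz" -C "$work/out"
restored=$(cat "$work/out/b.dat")

printf '%s\n' "$listing"
rm -rf "$work"
```

Listing an archive with -t before extracting it is a cheap way to check what you are about to submit (or unpack).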

May 13, 2008

How To Flush/Clear Squid Cache

Filed under: How To, Shell Script - tdas @ 7:29 pm

Squid is a high-performance proxy caching server for web clients, supporting FTP, gopher, and HTTP data objects. Unlike traditional caching software, squid handles all requests in a single, non-blocking, I/O-driven process.

Sometimes we need to clear the contents of the cache and restart the program. Clearing the Squid cache is dead simple:

Go to the directory where the squid init script resides ( e.g. /etc/init.d/ )
./squid flush

You will need root (su) privileges to perform the operation. Note that this relies on your init script providing a flush action, which not every distribution's script does.

Reference :
Squid Cache

February 17, 2008

How To Extract lines from a file in Unix?

Filed under: How To, Shell Script - tdas @ 8:27 pm

To extract a range of lines from a text file in Unix, we can use a combination of head and tail.

head -n 20 file.dat | tail -n 11   # this gives us lines 10-20 of file.dat ( 11 lines, since both ends are included )

Another elegant and easy solution for extracting a range of lines from a text file in Unix is sed.

sed -n '10,20p' file.dat > output.dat   # this also extracts lines 10-20 of file.dat
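
The line arithmetic is easy to get wrong by one, so here is a small sketch that checks both approaches against a generated file. The variable names first, last and count are mine, and the 30-line test file is made up. To get lines first through last, head must keep the first last lines and tail must keep the final last - first + 1 of those.

```shell
#!/bin/sh
# Generate a 30-line test file: "line 1" ... "line 30".
tmpfile=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 30; i++) print "line " i }' > "$tmpfile"

first=10
last=20
count=$((last - first + 1))   # 11 lines, both ends included

# head keeps lines 1..last, tail keeps the final $count of those
by_head_tail=$(head -n "$last" "$tmpfile" | tail -n "$count")

# sed prints only the addressed range; -n suppresses everything else
by_sed=$(sed -n "${first},${last}p" "$tmpfile")

printf '%s\n' "$by_sed"
rm -f "$tmpfile"
```

Both variables should hold the same eleven lines, starting at "line 10" and ending at "line 20".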

How To Convert Lower Case to Upper Case and Vice Versa in Unix?

Filed under: How To, Shell Script - tdas @ 6:32 pm

Converting lower case characters to upper case and vice versa is a fairly common task in the computer world. In Unix this can be done very easily with the tr command. Note that tr reads standard input and writes to standard output, so foo.dat itself is not modified.

To convert a file containing lower case characters to upper case characters :

tr '[:lower:]' '[:upper:]' < foo.dat   # Note: this prints the contents of foo.dat with every letter in upper case

To convert a file containing upper case characters to lower case characters :

tr '[:upper:]' '[:lower:]' < foo.dat   # Note: this prints the contents of foo.dat with every letter in lower case
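
Since tr only reads standard input and writes standard output, saving the result means redirecting it into a second file. A minimal sketch, with made-up file names and contents:

```shell
#!/bin/sh
# foo.dat here is a throwaway file created just for the demo.
work=$(mktemp -d)
printf 'Hello, World 42\n' > "$work/foo.dat"

# Redirect the converted text into new files; the input stays untouched.
tr '[:lower:]' '[:upper:]' < "$work/foo.dat" > "$work/foo_upper.dat"
tr '[:upper:]' '[:lower:]' < "$work/foo.dat" > "$work/foo_lower.dat"

upper=$(cat "$work/foo_upper.dat")
lower=$(cat "$work/foo_lower.dat")
original=$(cat "$work/foo.dat")

printf '%s\n%s\n' "$upper" "$lower"
rm -rf "$work"
```

Do not redirect the output back into the same file you are reading from: the shell truncates the file before tr gets to read it.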

How To Section

Filed under: How To - tdas @ 5:56 pm

I have been thinking about creating a How To section, where I would explain how to do simple things in the CS world. Topics can range from simple shell scripts to databases to programming languages. There are a few websites that provide an exhaustive list of how-tos; of those, I really like the WikiHow website. But anyway, the main motivation for creating this section is the fact that I often forget the most trivial commands at work and end up searching the web for a solution. What better than having my own little section with all the common how-tos I have compiled over the past few years?
