Hourly updates on bills

So at work I use a lot of bash scripting to free up time for things like catching my breath and occasionally playing some video games. Basically I discovered a bunch of command-line tools that let me fetch a webpage, look for some shit on that page, and return either a yea or a nay depending on whether it's found. Optionally it's possible to grab some data from that website and return that instead of just telling me whether or not the webpage contains the text it should. This is called "screen scraping", and I found out how useful it can be a few months ago when my father passed away.
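The yea/nay part really boils down to a couple of lines of curl and grep. A minimal sketch (the URL and pattern in the usage line are placeholders, not anything from the script below):

```shell
#!/bin/bash
## Minimal sketch of the yea/nay idea: fetch a page, grep it for a
## pattern, and report whether the pattern was found. The URL and
## pattern passed in are up to you.
page_has() {
    if curl -s -L "$1" | grep -q "$2"; then
        echo "yea"
    else
        echo "nay"
    fi
}

## Usage (placeholder URL and pattern):
## page_has "https://example.com/" "Example Domain"
```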

In my frame of mind it was easy not to really pay attention to the stupid "we'll shut your ass off from electricity if you don't pay" notice. To be fair, they used to send the second attempt at getting the bill paid on a colored piece of paper, and it stood out like a sore overdue notice, but nowadays it's not worth it for them to buy the colored paper, plus everyone sends out crap on colored paper, full glossy card-stock, and whatever else might grab a postal customer's attention. A year or so ago I'd set up a system I thought would work for me: a desktop calendar that highlighted the week or so after each bill's due date in red. Well, I wound up with so many bills that the entire month showed up in red, so that was pretty useless. I needed a better way. Behold the all-powerful screen-scraping shell script:

[code language="bash"]


#!/bin/bash

## These directories are configurable. I like using a separate
## user for this sort of stuff. The user in this case "scrapeuser"
## has its login disabled and can only be accessed via su after
## authenticating as root. Also the conf directory is set with
## permissions of 700, so only this user can read or enter it,
## and no one else can touch it. This is where all of the
## credentials for logging into the various sites are located.
## (Example paths; adjust to taste.)
BASE=/home/scrapeuser/tmp/
CONF=/home/scrapeuser/conf/
HTML=/home/scrapeuser/html/

## Make sure we're in the right dir
cd "${BASE}" || exit 1

## This iteration of curl grabs session cookies and
## viewstate and eventvalidation, used by aspx as
## credentials.

curl \
  --cookie-jar ${BASE}$$.cookies \
  --cookie ${BASE}$$.cookies \
  --user-agent 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:19.0) Gecko/20100101 Firefox/19.0' \
  -L \
  -s \
  https://www.progress-energy.com/app/loginregistration/login.aspx >${BASE}$$.tmp

VIEWSTATE=$( grep VIEWSTATE ${BASE}$$.tmp | cut -d '"' -f "8" )
EVENTVAL=$( grep EVENTVALIDATION ${BASE}$$.tmp | cut -d '"' -f "8" )
SMAGENT=$( grep SMAGENTNAME ${BASE}$$.tmp | cut -d '"' -f "6" )
/bin/rm ${BASE}$$.tmp

## This iteration of curl actually goes in and fetches the data from
## the website. Note the single quotes around the ctl00$... fields:
## without them the shell would try to expand $_BodyRegion.

curl \
  --config ${CONF}progress-energy.conf \
  --silent \
  --location \
  --location-trusted \
  --cookie-jar ${BASE}$$.cookies \
  --cookie ${BASE}$$.cookies \
  --data-urlencode "__LASTFOCUS=" \
  --data-urlencode "ctl00__scriptManager_HiddenField=" \
  --data-urlencode "__EVENTTARGET=" \
  --data-urlencode "__EVENTARGUMENT=" \
  --data-urlencode "__VIEWSTATE=${VIEWSTATE}" \
  --data-urlencode "__EVENTVALIDATION=${EVENTVAL}" \
  --data-urlencode "q=Search" \
  --data-urlencode 'ctl00$_BodyRegion$chkRemember=on' \
  --data-urlencode 'ctl00$_BodyRegion$btnLogin=LOGIN' \
  --data-urlencode "TARGET=-SM-http://www.progress-energy.com/app/youraccount/" \
  --data-urlencode "SMAGENTNAME=${SMAGENT}" \
  --referer "https://www.progress-energy.com/app/loginregistration/login.aspx" \
  --user-agent 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:19.0) Gecko/20100101 Firefox/19.0' \
  https://www.progress-energy.com/forms/post-pe.fcc \
  | egrep 'AcctBalance|lblPaymentDue' \
  | sed -e 's/<[^>]*>//g' | tr -d '\011' | tr -d '\015' >>${BASE}$$.txt

## Set Variables for the pagelet

read -r AMOUNT < ${BASE}$$.txt
DUE=$( date -d "$( head -n2 ${BASE}$$.txt | tail -n1 )" +%D )

## Write out a pagelet that we can either embed in another page or
## access from a desktop widget:

echo '<HTML><HEAD><meta http-equiv="refresh" content="300"></HEAD><BODY style="font-family: arial, helvetica; color: white;">' >${HTML}power.html
echo "<FONT face=sans-serif color=white>" >>${HTML}power.html
echo "<H2>ELECTRICITY PANEL</H2>" >>${HTML}power.html
echo "Total amount due: ${AMOUNT}<br>" >>${HTML}power.html
echo "Due on: ${DUE}<br>" >>${HTML}power.html
echo "Updated: $( date "+%D %H:%M" )<br>" >>${HTML}power.html
echo "</FONT></BODY></HTML>" >>${HTML}power.html

/bin/rm ${BASE}$$.cookies
/bin/rm ${BASE}$$.txt
[/code]


Sorry for the constrained boxes; I'll figure this out at some point. It'd be easier to see what's going on by copy/pasting all of that into a text editor, and I'm assuming that if you're interested in doing this you'll be doing copypasta anyway.

The configuration file referenced above contains my username and password for the power company. It looks similar to this example (the field names here are illustrative; match them to the input names on the actual login form):

[code light="true"]
--data-urlencode "ctl00$_BodyRegion$txtUsername=myusername"
--data-urlencode "ctl00$_BodyRegion$txtPassword=mypassword"
[/code]

From there I set up cron to run this thing every hour. In my crontab:

[code light="true"]
00    *    *    *    *    /home/scrapeuser/bin/progress-energy.sh > /dev/null 2>&1
[/code]

Once all of that is done I've got a widget installed on my KDE desktop (scripted HTML, it looks better than the webpage widget) to render the simple pagelet my script produces, and now, within an hour of the bill being updated, I can see exactly what I owe and when.


I suppose I could just set up everything to automatically deduct, but I still feel more comfortable logging in and manually applying my payments, out of a historical fear of shorting my bank account I suppose. Now it's off to go pay my water bill >.< Future iterations are going to make the text red and blinky in the widget if it's within 5 days of the due date or so.
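For the curious, that 5-day check could be as simple as a date comparison. A sketch, assuming GNU date and the same MM/DD/YY format (`%D`) the script already stores in DUE; the function name and the exact threshold are my own invention:

```shell
#!/bin/bash
## Pick a font color for the pagelet: red when the due date is five
## days out or less, white otherwise. Expects a date string GNU
## date can parse, e.g. the MM/DD/YY format from `date +%D`.
due_color() {
    local now due days_left
    now=$( date +%s )
    due=$( date -d "$1" +%s )
    days_left=$(( (due - now) / 86400 ))
    if [ "$days_left" -le 5 ]; then
        echo "red"
    else
        echo "white"
    fi
}

## e.g.: echo "<FONT color=$( due_color "$DUE" )>Due on: $DUE</FONT>"
```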

Filed under Linux/Bash Scripting
