Introduction
This Python assignment contains five arithmetic problems to solve.
Problem 1 - Decrypting Government Data
Your job is to summarize this government data about oil consumption.
- The format of the file is rather bizarre - note that each line has data for two months, in two different years! (Plus I had to hand-edit the file to make it parseable.)
- Fortunately, Python is great for untangling and manipulating data.
- Write a generator that reads from the given url over the network, and produces a summary line for one year's data on each next call
- remember that urllib.request returns bytes objects, not strings
- The generator should read the lines of the oil2.txt file in a lazy fashion - it should only read 13 lines for every two years of output. Note that a loop can contain any number of yield statements.
- Ignore the monthly data, just extract the yearly info
- Drop the month column
- In addition to the oil generator function, my solution had a separate helper function, def makeCSVLine(year, data):
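The lazy 13-line reading pattern described above can be sketched roughly as follows. This is only an illustration under loud assumptions: the real layout of oil2.txt is not shown here, so the "label year value year value" layout of the yearly line, the names make_csv_line and yearly_lines, and the in-memory io.BytesIO stand-in for the network response are all hypothetical. In real use, the object returned by urllib.request.urlopen(url) can be iterated line by line in the same way, and each line arrives as bytes.

```python
import io
from itertools import islice


def make_csv_line(year, data):
    """Join a year and its data fields into one comma-separated line."""
    return ",".join([str(year)] + [str(field) for field in data])


def yearly_lines(byte_lines, block_size=13):
    """Yield one CSV line per year, reading block_size lines per two years."""
    source = iter(byte_lines)
    while True:
        # islice pulls at most block_size lines, so nothing past this block is read yet.
        block = [raw.decode("utf-8").split() for raw in islice(source, block_size)]
        if len(block) < block_size:
            return  # input exhausted (a trailing partial block is ignored)
        # HYPOTHETICAL layout: the last line of each block carries the yearly
        # figures as "label year_a value_a year_b value_b"; the label is dropped.
        _label, year_a, value_a, year_b, value_b = block[-1]
        yield make_csv_line(year_a, [value_a])  # first yield in the loop body
        yield make_csv_line(year_b, [value_b])  # second yield in the same pass


# Made-up stand-in data: 12 monthly lines followed by one yearly line.
monthly = b"".join(b"Month 2014 1 2013 2\n" for _ in range(12))
fake_file = io.BytesIO(monthly + b"TOTAL 2014 12 2013 24\n")

gen = yearly_lines(fake_file)
print(next(gen))  # 2014,12
print(next(gen))  # 2013,24
```

The two yield statements in one loop pass are what let a single 13-line read produce two years of output.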
Here are the first two years of data, 2014 and 2013
Year,Quantity,QuantityChange,Unknown,Unknown2,Price,PriceChange
2014,2700903,-112867,246409332,-26397845,91.23,-5.72
2013,2813770,-283638,272807177,-40367786,96.95,-4.15
2012,3097408,-224509,313174963,-18407090,101.11,1.29
2011,3321917,-55160,331582053,79421544,99.82,25.15
2010,3377077,62290,252160509,63448733,74.67,17.74
2009,3314787,-275841,188711776,-153200712,56.93,-38.29
2008,3590628,-99940,341912488,104700835,95.22,30.95
2007,3690568,-43658,237211653,20584322,64.28,6.26
2006,3734226,-20445,216627331,40871990,58.01,11.20
2005,3754671,-66308,175755341,44012676,46.81,12.33
2004,3820979,144974,131742665,32575492,34.48,7.50
2003,3676005,257983,99167173,21883842,26.98,4.37
2002,3418022,-53045,77283331,2990437,22.61,1.21
2001,3471067,71827,74292894,-15583539,21.40,-5.04
2000,3399240,171148,89876433,38986812,26.44,10.68
1999,3228092,-14620,50889621,13637399,15.76,4.28
1998,3242712,173281,37252222,-16973685,11.49,-6.18
1997,3069431,175785,54225907,-704950,17.67,-1.32
1996,2893646,126333,54930857,11181204,18.98,3.17
now that we have something that looks like a CSV file, we can do all kinds of things
- could save it to a file, then
- Excel or OpenOffice could read it
- Python has a CSV reader
- with a little juggling, can easily pump the data into a pandas DataFrame
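As a small illustration of the CSV reader option, the sketch below feeds the header and the first two data rows through the standard library's csv.DictReader. The io.StringIO wrapper is just a stand-in for a saved file, and the variable names are illustrative rather than part of the assignment.

```python
import csv
import io

# Header plus the 2014 and 2013 rows from the generator output.
csv_text = """Year,Quantity,QuantityChange,Unknown,Unknown2,Price,PriceChange
2014,2700903,-112867,246409332,-26397845,91.23,-5.72
2013,2813770,-283638,272807177,-40367786,96.95,-4.15
"""

# DictReader uses the first line as field names; every value comes back as a string.
rows = list(csv.DictReader(io.StringIO(csv_text)))
prices = [float(row["Price"]) for row in rows]

print(rows[0]["Year"], min(prices), max(prices))  # 2014 91.23 96.95
```

Because the reader returns strings, numeric columns need an explicit float or int conversion before any arithmetic.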
Input:
with open('/tmp/oil.csv', 'w') as f:
Output:
Year Quantity QuantityChange Unknown Unknown2 Price PriceChange
Input:
[df['Price'].mean(), df['Price'].min(), df['Price'].max()]
Output:
[46.63681818181818, 11.49, 101.11]
Problem 2
- suppose we want to convert between C (Celsius) and F (Fahrenheit), using the equation 9 * C = 5 * (F - 32)
- could write functions c2f and f2c
- do all computation in floating point for this problem
Input:
def c2f(c):
Output:
[32.0, 212.0, 0.0, 100.0]
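One possible pair of functions, obtained by solving 9 * C = 5 * (F - 32) once for F and once for C, is sketched below. This is not necessarily the original solution; only the names c2f and f2c come from the problem statement.

```python
def c2f(c):
    """Celsius to Fahrenheit: solve 9 * C = 5 * (F - 32) for F."""
    return 9.0 * c / 5.0 + 32.0


def f2c(f):
    """Fahrenheit to Celsius: solve 9 * C = 5 * (F - 32) for C."""
    return 5.0 * (f - 32.0) / 9.0


print([c2f(0), c2f(100), f2c(32), f2c(212)])  # [32.0, 212.0, 0.0, 100.0]
```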
- to write f2c, we solved the equation for C, and made a function out of the other side of the equation
- to write c2f, we solved for F, . . .
- there is another way to think about this
- rearrange the equation into a symmetric form 9 * C - 5 * F = -32 * 5
- you can think of the equation above as a constraint between F and C. if you specify one variable, the other's value is determined by the equation. in general, suppose we have c0 * x0 + c1 * x1 + ... + cN * xN = total
- the cI are fixed coefficients
- specifying any N of the (N + 1) xs will determine the remaining x variable
- define a class, Constraint, that will do constraint satisfaction
- you may find dotnone to be helpful
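A minimal sketch of the idea, under stated assumptions: the assignment's actual Constraint interface is not shown, so the constructor arguments and the solve method here are hypothetical, and dotnone is assumed to skip any pair that contains None.

```python
def dotnone(xs, ys):
    """Dot product that skips pairs in which either value is None (assumed behavior)."""
    return sum(x * y for x, y in zip(xs, ys) if x is not None and y is not None)


class Constraint:
    """Represents c0 * x0 + c1 * x1 + ... + cN * xN = total (hypothetical interface)."""

    def __init__(self, coeffs, total):
        self.coeffs = list(coeffs)
        self.total = total

    def solve(self, values):
        """Return a copy of values with the single None entry filled in."""
        missing = [i for i, v in enumerate(values) if v is None]
        if len(missing) != 1:
            raise ValueError("exactly one value must be None")
        i = missing[0]
        # Sum of the known terms; the unknown term is whatever is left of total.
        known = dotnone(self.coeffs, values)
        solved = list(values)
        solved[i] = (self.total - known) / self.coeffs[i]
        return solved


# 9 * C - 5 * F = -32 * 5
c_and_f = Constraint([9.0, -5.0], -32.0 * 5.0)
print(c_and_f.solve([100.0, None]))  # [100.0, 212.0]
print(c_and_f.solve([None, 212.0]))  # [100.0, 212.0]
```

The same object answers both conversion directions, which is the point of treating the equation as a constraint rather than as two separate functions.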
Input:
# regular dot product, except that if one or both values in a pair is 'None',
Output:
[32, 22, 0]
Input:
# setup constraint btw C and F
Output:
[100.0, 212.0]
Problem 3 - Hamlet
- Python is very popular in digital humanities
- MIT has the complete works of Shakespeare in a simple html format
- You will do a simple analysis of Hamlet by reading the html file one line at a time (the usual iteration scheme) and doing pattern matching
- The goal is to return a list of the line count (linecnt), the total number of speeches (look at the file format), and a dict showing the number of speeches each character gives
- Your program should read directly from the url given, but you may want to download a copy to examine the structure of the file.
- remember that urllib.request returns bytes objects, not strings
- here's a short sample of the file
<A NAME=speech25><b>HORATIO</b></a>
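Based on that sample line, the pattern matching could be sketched as below. The function name count_speeches and the in-memory sample list are hypothetical stand-ins for the assignment's hamlet(url) and the network response, and every sample line other than the HORATIO one is made up for illustration. The regular expression assumes all speech markers look exactly like the sample.

```python
import re
from collections import Counter

# Matches a speech marker such as <A NAME=speech25><b>HORATIO</b></a>
SPEECH = re.compile(r"<A NAME=speech\d+><b>(.+?)</b></a>")


def count_speeches(byte_lines):
    """Return [line count, total speeches, {speaker: number of speeches}]."""
    linecnt = 0
    speakers = Counter()
    for raw in byte_lines:
        linecnt += 1
        # Lines arrive as bytes, so decode before matching a str pattern.
        match = SPEECH.search(raw.decode("utf-8"))
        if match:
            speakers[match.group(1)] += 1
    return [linecnt, sum(speakers.values()), dict(speakers)]


sample = [
    b"<A NAME=speech25><b>HORATIO</b></a>\n",
    b"some other line\n",
    b"<A NAME=speech26><b>MARCELLUS</b></a>\n",
    b"<A NAME=speech27><b>HORATIO</b></a>\n",
]
print(count_speeches(sample))  # [4, 3, {'HORATIO': 2, 'MARCELLUS': 1}]
```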
Input:
hamlet(url)
Output:
[8881,
Problem 4
- in class, we discussed two different ways to represent a polynomial
- polylist, a dense representation that holds the coefficients in a list
- polydict, a sparse representation that holds (exponent, coefficient) pairs in a dict
- add a method, topolydict() to class polylist, that converts the polylist into a polydict
- add a method, topolylist() to class polydict, that converts the polydict into a polylist
- note that polylist->polydict will always work, but polydict->polylist can fail, because a polylist cannot represent negative exponents. in this case, raise a ValueError
- just to tell them apart, polylist prints with a leading +
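The two conversions can be sketched with stripped-down versions of the classes. The class versions from the lectures are not shown here, so the attribute names coeffs and terms are assumptions, printing is omitted, and coeffs[i] is taken to be the coefficient of X ** i.

```python
class polylist:
    """Dense polynomial: coeffs[i] is the coefficient of X ** i (assumed layout)."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)

    def topolydict(self):
        # Always succeeds: list indices are never negative. Zero terms are dropped.
        return polydict({exp: c for exp, c in enumerate(self.coeffs) if c != 0})


class polydict:
    """Sparse polynomial: terms maps exponent -> coefficient (assumed layout)."""

    def __init__(self, terms):
        self.terms = dict(terms)

    def topolylist(self):
        # A list index cannot be negative, so negative exponents have no dense form.
        if any(exp < 0 for exp in self.terms):
            raise ValueError("polylist cannot represent negative exponents")
        coeffs = [0] * (max(self.terms, default=-1) + 1)
        for exp, c in self.terms.items():
            coeffs[exp] = c
        return polylist(coeffs)


print(polylist([1, 2, 3]).topolydict().terms)        # {0: 1, 1: 2, 2: 3}
print(polydict({1: 10, 2: 5}).topolylist().coeffs)   # [0, 10, 5]
```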
Input:
pl1 = polylist([1, 2, 3])
Output:
[+ 3 * X ** 2 + 2 * X + 1,
Input:
[pl1.topolydict(), pl2.topolydict(), pd1.topolylist(), pd2.topolylist()]
Output:
[3 * X ** 2 + 2 * X + 1, 5 * X ** 2 + 10 * X, + 3 * X ** 2 + 2 * X + 1, + 5 * X ** 2 + 10 * X]
Problem 5
define the __mul__ method for polydict
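One way to approach this, sketched on a stripped-down polydict: multiply every term of one polynomial by every term of the other, add the exponents, and accumulate the coefficient products. The terms attribute is an assumption about how the class stores its (exponent, coefficient) pairs, and printing is omitted.

```python
class polydict:
    """Sparse polynomial: terms maps exponent -> coefficient (assumed layout)."""

    def __init__(self, terms):
        self.terms = dict(terms)

    def __mul__(self, other):
        product = {}
        for exp1, c1 in self.terms.items():
            for exp2, c2 in other.terms.items():
                # X ** a times X ** b is X ** (a + b); like terms are summed.
                exp = exp1 + exp2
                product[exp] = product.get(exp, 0) + c1 * c2
        # Drop terms whose coefficients cancelled to zero.
        return polydict({exp: c for exp, c in product.items() if c != 0})


p = polydict({0: 1, 1: 2, 2: 3})   # 3 * X ** 2 + 2 * X + 1
q = polydict({1: 10, 2: 5})        # 5 * X ** 2 + 10 * X
print((p * q).terms)               # {1: 10, 2: 25, 3: 40, 4: 15}
```

Because exponents are plain dict keys, the same method also handles negative exponents without any special casing.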
Input:
[pd1, pd2, pd3, pd1 * pd2, pd1 * pd3, pd2 * pd3]
Output:
[+ 3 * X ** 2 + 2 * X + 1,