Python: PdfFileReader PdfReadError: EOF marker not found

I hit this error when trying to read the content of a PDF file with the Python library pyPDF.

PdfFileReader(StringIO.StringIO(result.content))

It turned out that the PDF file I was trying to read was not a standard PDF file.

A standard PDF file has a “%%EOF” marker within the last 1024 bytes of the file.

But a PDF does not have to be a standard one to be viewable. I could still open this “broken” PDF correctly in most PDF readers, while the pyPDF library threw the “EOF marker not found” exception.

The solution is to change the read method in pyPDF/pdf.py from:

def read(self, stream):
    # start at the end:
    stream.seek(-1, 2)
    line = ''
    while not line:
        line = self.readNextEndLine(stream)
    if line[:5] != "%%EOF":
        raise utils.PdfReadError, "EOF marker not found"

To:


def read(self, stream):
    # start at the end:
    stream.seek(-1, 2)
    line = ''
    # walk backwards, skipping any trailing lines, until the %%EOF marker is found
    while line[:5] != "%%EOF":
        line = self.readNextEndLine(stream)
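
If you would rather not patch the library, a different approach is to drop whatever comes after the last “%%EOF” marker before handing the bytes to the reader. This is only a minimal sketch, not what pyPDF does internally: the helper name trim_after_eof is made up for illustration, and it assumes the only problem with the file is extra data after the marker.

```python
def trim_after_eof(data: bytes) -> bytes:
    """Return data cut off right after the last %%EOF marker.

    Hypothetical helper, not part of pyPDF. It assumes the file is
    otherwise a valid PDF with trailing junk after the marker.
    """
    marker = b"%%EOF"
    pos = data.rfind(marker)
    if pos == -1:
        # No marker at all: nothing sensible to trim.
        raise ValueError("no %%EOF marker found")
    return data[: pos + len(marker)]


# Example with made-up bytes standing in for a downloaded PDF.
cleaned = trim_after_eof(b"%PDF-1.4\nbody\n%%EOF\ntrailing junk")
```

The cleaned bytes can then be wrapped in a file-like object and passed to PdfFileReader as before.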

JavaScript Base64 Encoding Decoding

I ran into this painful problem while using Google’s Gmail API.

I was trying to retrieve the body of an email, which turns out to be Base64 data.

The first approach I found was the atob() function.

It works in some cases, but not all of them.

When I decode with atob(), I sometimes see this error:

Uncaught InvalidCharacterError: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.

Test data is:

PHA-Tm8gcHJvYmxlbSBhdCBhbGwsIEkgaHVnZWx5IGFwcHJlY2lhdGUgeW91ciBxdWljayByZXNwb25zZSwgZm9yIG1ha2luZyBhbiBhd2Vzb21lIHBsdWdpbiBmb3IgdXMgYWxsIHRvIGVuam95LCBhbmQgZm9yIGNvbnRpbnVpbmcgdG8gcHJvdmlkZSBzdXBwb3J0IDwvcD4NCg0KPHAgc3R5bGU9ImZvbnQtc2l6ZTpzbWFsbDstd2Via2l0LXRleHQtc2l6ZS1hZGp1c3Q6bm9uZTtjb2xvcjojNjY2OyI-Jm1kYXNoOzxicj5SZXBseSB0byB0aGlzIGVtYWlsIGRpcmVjdGx5IG9yIDxhIGhyZWY9Imh0dHBzOi8vZ2l0aHViLmNvbS9tYmVuZm9yZC9uZ1RhZ3NJbnB1dC9pc3N1ZXMvNDA2I2lzc3VlY29tbWVudC04NzI1MzgyMCI-dmlldyBpdCBvbiBHaXRIdWI8L2E-LjxpbWcgYWx0PSIiIGhlaWdodD0iMSIgc3JjPSJodHRwczovL2dpdGh1Yi5jb20vbm90aWZpY2F0aW9ucy9iZWFjb24vQUQ1N1U0eERaY3pvXzNITnZBYnpOMm9udmlBZ3lWdFJrczVuNXNjb2dhSnBaTTREeGs3Ty5naWYiIHdpZHRoPSIxIiAvPjwvcD4NCjxkaXYgaXRlbXNjb3BlIGl0ZW10eXBlPSJodHRwOi8vc2NoZW1hLm9yZy9FbWFpbE1lc3NhZ2UiPg0KICA8ZGl2IGl0ZW1wcm9wPSJhY3Rpb24iIGl0ZW1zY29wZSBpdGVtdHlwZT0iaHR0cDovL3NjaGVtYS5vcmcvVmlld0FjdGlvbiI-DQogICAgPGxpbmsgaXRlbXByb3A9InVybCIgaHJlZj0iaHR0cHM6Ly9naXRodWIuY29tL21iZW5mb3JkL25nVGFnc0lucHV0L2lzc3Vlcy80MDYjaXNzdWVjb21tZW50LTg3MjUzODIwIj48L2xpbms-DQogICAgPG1ldGEgaXRlbXByb3A9Im5hbWUiIGNvbnRlbnQ9IlZpZXcgSXNzdWUiPjwvbWV0YT4NCiAgPC9kaXY-DQogIDxtZXRhIGl0ZW1wcm9wPSJkZXNjcmlwdGlvbiIgY29udGVudD0iVmlldyB0aGlzIElzc3VlIG9uIEdpdEh1YiI-PC9tZXRhPg0KPC9kaXY-DQo=

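You can try it yourself. I know this is valid Base64 data because I could decode it correctly in Python. Note that it uses the URL-safe Base64 alphabet (“-” and “_” in place of “+” and “/”), which atob() does not accept.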

import base64

base64.urlsafe_b64decode(data)
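
The difference between the two alphabets can be sketched in Python. The URL-safe variant uses “-” and “_” where standard Base64 uses “+” and “/”, so swapping those characters back gives a string that a standard decoder accepts. The helper names below are made up for illustration, and the padding step is an assumption for inputs whose trailing “=” has been stripped.

```python
import base64


def to_standard_base64(data: str) -> str:
    """Map URL-safe Base64 characters back to the standard alphabet."""
    return data.replace("-", "+").replace("_", "/")


def decode_base64url(data: str) -> bytes:
    """Decode URL-safe Base64, restoring missing '=' padding first."""
    # Assumption: the input may have lost its padding; re-add up to 3 '='.
    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded)


# First eight characters of the test data above.
prefix = "PHA-Tm8g"
decoded = decode_base64url(prefix)
same = base64.b64decode(to_standard_base64(prefix))
```

The same character swap can be applied to the string in JavaScript before calling atob().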

After quite a lot of searching, I found this library, which works great for me.

https://github.com/dankogai/js-base64

I have also heard that goog.crypt.base64 in the Google Closure Library is a good option for encoding and decoding. I haven’t figured out a quick way to use it in my JS, so I didn’t try it.

Error: wiredep:app running grunt serve after Yeoman generates angular project

Yeoman is a scaffolding tool for developers.


$ yo angular

creates an Angular project folder for you.

But to my surprise, after doing this, when I run


$ grunt serve

to start the localhost server, it fails with an error.

Running "wiredep:app" (wiredep) task
Warning: ENOENT, no such file or directory '/Users/txzhang/git/mm-motohire/app/bower.json' Use --force to continue.

Aborted due to warnings.

To fix this, change the file Gruntfile.js in the root folder of your project.


wiredep: {
  options: {
    cwd: '<%= yeoman.app %>'
  },
  app: {
    src: ['<%= yeoman.app %>/index.html'],
    ignorePath: /\.\.\//
  }
},

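Remove the line cwd: '<%= yeoman.app %>'. As the warning shows, with that option wiredep looks for bower.json inside the app folder; without it, wiredep should fall back to the project root.

Then you should be good to go.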

How to use custom font in web HTML

Sometimes you might want to use your own custom font for your web application.

Here is a quick way to do this.

Important: Make sure the file font-file.ttf is in the same folder as the CSS file. A path in url() is resolved relative to the CSS file, so a relative path to another folder should also work, but I haven’t tried it.

In the CSS file, define the font-family:


@font-face {
     font-family: font-name;
     src: url('font-file.ttf');
}

#element_id {
     font-family: font-name;
}

Then, in your HTML, the text is shown in the custom font:


<p id="element_id">The words to show</p>

Build Angular Google App Engine Cloud Endpoints Application Notes Python

1 Build angular folders

Use Yeoman to build the Angular folder structure.

Run yo angular and select the libraries needed for the project.

2 Create app.yaml in the root folder of the project.


application: application-name
version: 0-1
runtime: python27
api_version: 1
threadsafe: true

skip_files:
- ^(.*/)?.*node_modules/.*$

handlers:
# Endpoints handler
- url: /_ah/spi/.*
  script: endpoints.app
  secure: always
- url: /bower_components
  static_dir: bower_components

- url: (.*)/
  static_files: app\1/index.html
  upload: app
  login: required
  secure: always

- url: (.*)
  static_files: app\1
  upload: app
  login: required
  secure: always

libraries:
- name: endpoints
  version: "1.0"
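
To see how the last two static handlers map request paths to files under app, here is a small Python sketch. It is a simplified model and not App Engine code: it assumes handlers are tried in order, that a url pattern must match the whole path, and that \1 in static_files is replaced by the first captured group. The function name resolve_static is made up, and the Endpoints and bower_components handlers are left out.

```python
import re

# (url pattern, static_files template) pairs copied from the last two handlers.
HANDLERS = [
    (r"(.*)/", r"app\1/index.html"),
    (r"(.*)", r"app\1"),
]


def resolve_static(path):
    """Return the file path the first matching handler would serve."""
    for pattern, template in HANDLERS:
        match = re.fullmatch(pattern, path)
        if match:
            # expand() substitutes \1 with the captured group.
            return match.expand(template)
    return None


root = resolve_static("/")               # directory-style URL
script = resolve_static("/scripts/app.js")  # plain file URL
```

Under these assumptions, a path ending in “/” is served from that folder’s index.html, and every other path is served as the matching file under app.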

Google App Engine: PDF reportlab Import Error ImportError: Cannot re-init internal module __main__

When using the Python library reportlab on App Engine, I ran into this import error.

 File "/Users/bmh/GitRepos/MyProject/charges/views.py", line 15, in <module>
    from MyProject.invoices import generate_invoice, format_long_number
  File "/Users/bmh/GitRepos/MyProject/charges/invoices.py", line 23, in <module>
    from reportlab.lib import colors
  File "/Users/bmh/GitRepos/MyProject/contrib/reportlab/lib/colors.py", line 44, in <module>
    from reportlab.lib.rl_accel import fp_str
  File "/Users/bmh/GitRepos/MyProject/contrib/reportlab/lib/rl_accel.py", line 19, in <module>
    import __main__
ImportError: Cannot re-init internal module __main__

This import error shows up if you are running an old version of reportlab on App Engine.

https://bitbucket.org/rptlab/reportlab/issue/27/cannot-re-init-internal-module-__main__-on

This is the issue that was filed about the import error, and it has already been fixed.

See this change.

https://bitbucket.org/rptlab/reportlab/commits/ca6c60fd1f627a0f9c040b370ef52f9f4496d6f5

So to make everything work, just go to the Bitbucket repository and download the latest code (version 3.1.10 or above should be fine).

https://bitbucket.org/rptlab/reportlab/downloads
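
If you are not sure which version is bundled with your project, a quick check like the one below can help. This is only a sketch: the helper name is made up, it assumes the version string is plain dotted numbers such as "3.1.10", and you need to pass in whatever version string your copy of reportlab reports.

```python
def is_at_least(version, minimum=(3, 1, 10)):
    """Return True if a dotted version string is >= the minimum tuple.

    Hypothetical helper; assumes purely numeric components.
    """
    parts = tuple(int(p) for p in version.split(".")[:3])
    return parts >= minimum


# Example with hard-coded strings standing in for the real version.
old_ok = is_at_least("3.0")
new_ok = is_at_least("3.1.10")
```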

Leetcode: Database Department Top Three Salaries

The Employee table holds all employees. Every employee has an Id, and there is also a column for the department Id.

+----+-------+--------+--------------+
| Id | Name  | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Henry | 80000  | 2            |
| 3  | Sam   | 60000  | 2            |
| 4  | Max   | 90000  | 1            |
| 5  | Janet | 69000  | 1            |
| 6  | Randy | 85000  | 1            |
+----+-------+--------+--------------+

The Department table holds all departments of the company.

+----+----------+
| Id | Name     |
+----+----------+
| 1  | IT       |
| 2  | Sales    |
+----+----------+

Write a SQL query to find the employees who earn the top three salaries in each department. For the above tables, your SQL query should return the following rows.

+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Max      | 90000  |
| IT         | Randy    | 85000  |
| IT         | Joe      | 70000  |
| Sales      | Henry    | 80000  |
| Sales      | Sam      | 60000  |
+------------+----------+--------+

After trying different solutions, it seems this question does not place much restriction on the order of the result.


SELECT d.Name AS Department, e1.Name AS Employee, e1.Salary
FROM Employee e1, Department d
WHERE e1.DepartmentId = d.Id
      -- keep e1 only if fewer than 3 distinct salaries in its department are higher
      AND (SELECT COUNT(DISTINCT e2.Salary)
           FROM Employee e2
           WHERE e2.DepartmentId = e1.DepartmentId
                 AND e2.Salary > e1.Salary
          ) < 3;
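
To check the query against the sample tables, it can be run on an in-memory SQLite database from Python. This is a sketch for local testing only: the column types are my assumption, and the ORDER BY clause is added just to make the output order predictable, since the original query does not specify one.

```python
import sqlite3

QUERY = """
SELECT d.Name AS Department, e1.Name AS Employee, e1.Salary
FROM Employee e1, Department d
WHERE e1.DepartmentId = d.Id
  AND (SELECT COUNT(DISTINCT e2.Salary)
       FROM Employee e2
       WHERE e2.DepartmentId = e1.DepartmentId
         AND e2.Salary > e1.Salary) < 3
ORDER BY d.Name, e1.Salary DESC
"""


def top_three_salaries():
    """Load the sample data into SQLite and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employee (Id INTEGER, Name TEXT, Salary INTEGER, DepartmentId INTEGER);
        CREATE TABLE Department (Id INTEGER, Name TEXT);
        INSERT INTO Employee VALUES
            (1, 'Joe', 70000, 1), (2, 'Henry', 80000, 2), (3, 'Sam', 60000, 2),
            (4, 'Max', 90000, 1), (5, 'Janet', 69000, 1), (6, 'Randy', 85000, 1);
        INSERT INTO Department VALUES (1, 'IT'), (2, 'Sales');
    """)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


rows = top_three_salaries()
for department, employee, salary in rows:
    print(department, employee, salary)
```

Janet is left out because three distinct salaries in IT (90000, 85000 and 70000) are higher than hers.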