Writing Jupyter notebooks: avoiding legal issues

Reusing your company’s content

If you’re writing or publishing a notebook that belongs to your company (meaning your company owns the copyright to it), then you might be able to reuse other content that belongs to your company without quoting or citing it. But check with your Legal department!

Reusing third-party text

Third-party means not you or your company. It can be very tempting to copy text from third-party sources. For example, doesn’t it make sense to copy the sentence from the original website that perfectly describes the algorithm you’re using? Why rewrite perfection? Because that text is almost certainly copyrighted, which means you generally can’t copy it without permission. Copying it anyway can be copyright infringement, and passing it off as your own words is also plagiarism.

Reusing third-party images

Even images that allow free use have terms of use. Check the website’s terms to see if you can use the image and whether you need to include a citation.

Using open data sets

You might assume that you can use open data sets any way you want, but they almost always have terms of use. For example, one of the biggest collections of open data sets, www.data.gov, has an Access & Use section for each data set. The terms of use might also require you to include a citation, as the UCI Machine Learning Repository does: https://archive.ics.uci.edu/ml/citation_policy.html.
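If a data set’s terms ask for a citation, one way to keep it visible is to generate the attribution line in the notebook itself. The following is only a minimal sketch: the `SourceAttribution` class, its fields, and the example values are all hypothetical, and the wording that a given data set actually requires is set by its own terms of use, not by this helper.

```python
from dataclasses import dataclass


@dataclass
class SourceAttribution:
    """Hypothetical container for the details a citation usually needs."""
    title: str
    publisher: str
    url: str
    accessed: str        # date you downloaded the data, for example "2024-01-01"
    terms_url: str = ""  # link to the terms of use, if the source publishes one

    def to_markdown(self) -> str:
        """Build a one-line attribution suitable for a Markdown cell."""
        line = (
            f"Data source: [{self.title}]({self.url}), "
            f"{self.publisher}. Accessed {self.accessed}."
        )
        if self.terms_url:
            line += f" [Terms of use]({self.terms_url})."
        return line


# Placeholder values only; replace them with the real details of your source.
attribution = SourceAttribution(
    title="Example Data Set",
    publisher="Example Publisher",
    url="https://example.org/dataset",
    accessed="2024-01-01",
    terms_url="https://example.org/terms",
)
print(attribution.to_markdown())
```

You can paste the printed line into a Markdown cell next to the code that loads the data, so the citation travels with the notebook. If the source specifies an exact citation format, use that text instead.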

Linking to websites

Some websites have linking policies (who knew?). For example, see https://www.data.gov/privacy-policy#linking and https://www.ibm.com/legal (scroll down to the “Linking to this site” section).

The bottom line

You can’t use ANY content from a third party (that is, content that you or your company didn’t create or don’t hold the copyright to) unless you follow the terms of use or get permission from the copyright holder.

Additional resources

https://www.copyright.gov/ (This website is actually more fun than you might think. Read about how you can copyright your Elvis sighting.)



