🗓 Date

👥 Participants

🥅 Agenda

🗣 Discussion topics

Item

Notes

Last meeting

  • Covered tools and data pipeline.

    • 2 repos: api-tools and nebula-api

      • api-tools: scraping, parsing, and uploading

      • nebula-api: sits between MongoDB and the user

Tasks/issues

  • Some L1 tasks have opened up again because the people who picked them up last semester didn’t finish them

    • Fix/suppress some annoying chromedp errors

    • fix parser bug pulling the room as “No Meeting” and others

    • add /events/:day/:building/:room/sections route

    • unit testing!

      • Josh will provide some example tests by next meeting so that anyone can write the rest

    • on-hold tasks: feel free to pick these up if interested

      • astra isn’t as useful as we thought, so schemas for that scraper are on hold

      • changing api-tools so it doesn’t have to re-upload and replace all the data in MongoDB on every scrape+parse+upload run
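For the room parser bug listed above, one possible shape for a fix is to treat placeholder strings as "no location" instead of splitting them into a building and a room. This is only a minimal sketch: the `Location` type, the `parseRoom` helper, and the assumed `<building> <room>` input form (for example "ECSS 2.412") are hypothetical and not taken from api-tools. Only "No Meeting" comes from these notes.

```go
package main

import (
	"fmt"
	"strings"
)

// Location is a hypothetical parsed meeting location.
type Location struct {
	Building string
	Room     string
}

// placeholders lists strings that mean "no physical room".
// Only "No Meeting" comes from the notes; add others as they are found.
var placeholders = map[string]bool{
	"no meeting": true,
}

// parseRoom splits a raw location such as "ECSS 2.412" into building and room.
// It returns ok=false for empty input, for known placeholders, and for anything
// that does not have exactly the assumed "<building> <room>" form.
func parseRoom(raw string) (loc Location, ok bool) {
	s := strings.TrimSpace(raw)
	if s == "" || placeholders[strings.ToLower(s)] {
		return Location{}, false
	}
	fields := strings.Fields(s)
	if len(fields) != 2 {
		return Location{}, false
	}
	return Location{Building: fields[0], Room: fields[1]}, true
}

func main() {
	for _, raw := range []string{"ECSS 2.412", "No Meeting", ""} {
		loc, ok := parseRoom(raw)
		fmt.Printf("%q -> %+v ok=%v\n", raw, loc, ok)
	}
}
```

Without the placeholder check, "No Meeting" would split cleanly into building "No" and room "Meeting", which is the kind of bad value the task describes. A function this small is also a convenient first target for table-driven unit tests.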

How the team works

  • Josh puts out tasks in batches; pick one up and work on it

  • Feel free to work in groups

rooms-route branch

  • route

    • adds OPTIONS and GET routes

  • controller

    • pulls data from mongodb

  • work already done on the database

    • aggregations in the database to transform data

    • sections creates the rooms and events views

      • rooms shows every classroom

      • events shows every section on a day

    • doing this in MongoDB keeps the API code from getting messy

Tech stack: stuff to learn

  • Language: Go (Golang)

  • Database: MongoDB

  • Scraping: ChromeDP

  • Routing: Gin

  • Check #announcements in the API channel for resources

Tool automation - big area of contribution

  • The tools all run from the CLI and could be automated

    • But there always seem to be a few issues to overcome in scraping

    • And the rate we scrape at is slow because coursebook locks us out at some point

      • IP rate limit means you can only scrape 1-2 semesters

    • CAPTCHAs pop up at random times and stop all scraping

Cloud storage

  • Store images and videos

    • Not in MongoDB

  • Google Cloud Storage holds large binary files, organized as:

    • buckets: like directories, one for each Nebula project

    • objects: items in buckets

  • Why not just have each Nebula project use its own Google Cloud Storage?

    • the Google Cloud Storage APIs are difficult and overcomplicated

    • our API’s goal is to make managing buckets and objects easy

    • e.g. basic CRUD operations such as /bucket/create, etc.

How to get the nebula-api running?

  • You’ll need Go installed

  • Clone the repo with git

  • Copy the .env.template file to a file named .env and fill out at least the DATABASE_URI

  • Run make build on macOS or .\build.bat on Windows

  • Run ./go-api.exe to start the API, then open http://localhost:8080/swagger/index.html to test it

  • You can also build it with Docker, which is how we deploy the API

⤴ Decisions

    ✅ Action items
