[Tool/Web] LDraw to Studio Exporter

RE: [Tool/Web] Get specific part by id out or LDraw with all needed files included
So I did a quick Python crash course and tried to write a small crawler for the unofficial parts library.

It looks pretty good so far and does what I intended. Given my Python experience of zero, I consider this a major achievement :-)

You pass it the URL of a part in the unofficial library, and it compiles a list of the subparts that have to be downloaded so that the part "works". In future, once I manage the download and zipping, you can then pass the package to e.g. Studio.

So it looks into the "Required (unofficial) subfiles" section of the part's page and recursively walks through them.

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# Assumption: base URL of the unofficial library files (not shown in the
# original post); adjust it if the site layout differs.
liburl = "https://www.ldraw.org/library/unofficial/"

class CrawledPart:
    def __init__(self, Part, PartLink, DATLink):
        self.Part = Part          # raw href of the subpart link
        self.PartLink = PartLink  # absolute URL of the subpart's detail page
        self.DATLink = DATLink    # absolute URL of the subpart's .dat file

class PartFetcher:
    def fetch(self, partno):
        # partno is the URL of a part's detail page. Test cases:
        # u9247 (lots of data)
        # u9576 (no subfiles)
        print("Part to fetch: " + partno)
        r = requests.get(partno)

        # cut the page off at the marker for the RELATED subfiles, we do not need the parent files
        doc = BeautifulSoup(r.text.split("Related")[0], "html.parser")
        # doc = BeautifulSoup(r.text, "html.parser")

        # class .list contains the list of the required sub-parts
        link = doc.select(".list")

        crawledparts = []
        if len(link) > 0:
            for subpart in link[0].select(".header"):
                Part = subpart.attrs["href"]
                # resolve the href against the page it was found on
                PartLink = urljoin(partno, Part)
                # the value after "=" is the library-relative path, e.g. parts/71603.dat
                DATLink = urljoin(liburl, Part.split("=")[1])
                crawled = CrawledPart(Part, PartLink, DATLink)
                crawledparts.append(crawled)
                print("Subpart: " + PartLink)
        return crawledparts

And the call:
fetcher = PartFetcher()
parturl = "https://www.ldraw.org/cgi-bin/ptdetail.cgi?f=parts/71603.dat"
for item in fetcher.fetch(parturl):
    print(item.DATLink)
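
As posted, fetch() reads a single detail page. If that page only lists direct subfiles, walking the whole tree could look roughly like the sketch below. It is an iterative breadth-first walk rather than literal recursion, and it keeps a set of visited parts so that shared subparts are handled only once. The names `collect_subparts` and `get_children` are made up for illustration, and the demo data stands in for real pages.

```python
from collections import deque

def collect_subparts(start, get_children):
    """Return start plus every part reachable from it, each exactly once.

    get_children is a hypothetical callable that maps a part URL to the
    list of URLs of its direct subparts.
    """
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        order.append(current)
        for child in get_children(current):
            if child not in seen:  # skip subparts that were already queued
                seen.add(child)
                queue.append(child)
    return order

# Stand-in data instead of real pages, so the sketch runs offline.
demo_tree = {
    "a.dat": ["s/a1.dat", "s/a2.dat"],
    "s/a1.dat": ["p/stud.dat"],
    "s/a2.dat": ["p/stud.dat"],
}
demo_result = collect_subparts("a.dat", lambda part: demo_tree.get(part, []))
```

With the crawler, get_children could be something like `lambda u: [p.PartLink for p in PartFetcher().fetch(u)]`, assuming fetch returns a fresh list on every call. The result starts with the root part itself, which would also cover the first open item.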

What remains:
- Include the root part's link as well
- Download the individual files
- Pack them into a ZIP with the correct folder structure
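
A rough sketch of how the packing step might go, using only the standard library. It assumes that the "correct folder structure" is simply the path given in the `f` parameter of the detail URL (e.g. parts/71603.dat); I have not checked what Studio actually expects. `archive_path` and `pack_parts` are made-up helper names, and the file contents are placeholders.

```python
import io
import zipfile
from urllib.parse import urlparse, parse_qs

def archive_path(detail_url):
    """Take the library-relative path from the f query parameter of a detail URL."""
    query = parse_qs(urlparse(detail_url).query)
    return query["f"][0]

def pack_parts(files):
    """Write {relative path: file bytes} into an in-memory ZIP and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, data in sorted(files.items()):
            archive.writestr(path, data)  # the path keeps folders such as parts/s/
    return buffer.getvalue()

# Placeholder contents; the real bytes would come from the downloads.
root_url = "https://www.ldraw.org/cgi-bin/ptdetail.cgi?f=parts/71603.dat"
demo_files = {
    archive_path(root_url): b"0 root part\n",
    "parts/s/example.dat": b"0 subpart\n",  # hypothetical subpart name
}
demo_zip = pack_parts(demo_files)
demo_names = zipfile.ZipFile(io.BytesIO(demo_zip)).namelist()
```

The bytes for each entry could be fetched with `requests.get(item.DATLink).content`, and the returned ZIP bytes written to disk with a plain `open(name, "wb")`.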

Messages In This Thread
RE: [Tool/Web] Get specific part by id out or LDraw with all needed files included - by Gerald Lasser - 2022-07-13, 16:37
