Adding the s.mxtv.jp and www.skyperfectv.co.jp sites and mapping IDs for tvguide.myjcom.jp #2409

Merged: 16 commits, Oct 14, 2024

Animenosekai
Contributor

@Animenosekai Animenosekai commented Jul 31, 2024

Hello!

Note

This was made in response to a request in issue Animenosekai/japanterebi-xmltv#1.

I've just created two new parsers for the following sites/sources:

  • s.mxtv.jp: provides EPG data for Tokyo MX1 and Tokyo MX2 (which were missing previously)
  • www.skyperfectv.co.jp: provides EPG data for a range of specific or paid channels

I also added TSS from JCOM and matched the following IDs from JCOM:

  • MTV
  • Mystery Channel
  • Space Shower
  • MONDO
  • Nikkei CNBC
  • Pachinko Pachislo
  • GSTV

Important

This does not add any dependencies, since the repository already depends on cheerio, axios and dayjs.

🎐

@Animenosekai Animenosekai marked this pull request as ready for review July 31, 2024 18:15
@Animenosekai
Contributor Author

Marked as ready to merge since I successfully tested the two new parsers and there were no issues.

@Animenosekai Animenosekai linked an issue Aug 1, 2024 that may be closed by this pull request
@PopeyeTheSai10r
Collaborator

PopeyeTheSai10r commented Aug 12, 2024

s.mxtv.jp, skyperfectv.co.jp and tvguide.myjcom.jp all passed the test.
s.mxtv.jp and tvguide.myjcom.jp are returning data.

skyperfectv.co.jp is returning an unexpected output when run using
npm run grab -- --site=www.skyperfectv.co.jp

Contributor

@BellezaEmporium BellezaEmporium left a comment

I can't test the channel output for now, but here are a few remarks:

  • While it's coming from a good intention to add a README on the SkyPerfect and s.mxtv folder, we simply do not do per-site READMEs
  • Simply call it "skyperfectv.co.jp"; that will be easier for all of us, I guess.
  • I suppose the yarn.lock change is linked to your computer, though I do not know if it's a needed change in GH.

Collaborator

@freearhey freearhey left a comment

Please remove the “www” from the file names of the “skyperfectv.co.jp” to match the rest of the files in the repository and roll back the changes to the yarn.lock file.

Otherwise everything looks fine to me.

@Animenosekai
Contributor Author

Please remove the “www” from the file names of the “skyperfectv.co.jp” to match the rest of the files in the repository

I left the www because the site is not accessible without www in front of the domain but I guess I'll change it then.

roll back the changes to the yarn.lock file

Yup I'll do that just now.

While it's coming from a good intention to add a README on the SkyPerfect and s.mxtv folder, we simply do not do per-site READMEs

To be honest I just copied the structure of an already existing site folder.

@Animenosekai
Contributor Author

And for the www I copied www3.nhk.or.jp which has www3 as a subdomain in front of the domain.

@BellezaEmporium
Contributor

Please remove the “www” from the file names of the “skyperfectv.co.jp” to match the rest of the files in the repository

I left the www because the site is not accessible without www in front of the domain but I guess I'll change it then.

roll back the changes to the yarn.lock file

Yup I'll do that just now.

While it's coming from a good intention to add a README on the SkyPerfect and s.mxtv folder, we simply do not do per-site READMEs

To be honest I just copied the structure of an already existing site folder.

If the www is necessary in the code, keep it there; only rename the folder so that it fits our naming standard.

freearhey previously approved these changes Aug 25, 2024
@BellezaEmporium
Contributor

Certain NSFW (18+) channels are excluded from the EPG data because the site asks for a confirmation first. A cookie should be sent with the request to make sure it doesn't return incomplete data.

@BellezaEmporium
Contributor

BellezaEmporium commented Aug 27, 2024

By the way, channel 536 and channel 537 have the same tvg-id for different "premium IDs".
[image]
Is it on purpose? It could clash when read by a player.
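
A clash like this can be caught before a player ever sees it. Here is a minimal sketch, assuming the channel list is available as an array of objects with `site_id` and `xmltv_id` fields; the entries and the helper name `findDuplicateIds` are hypothetical and only serve as an illustration:

```javascript
// Hypothetical channel entries; the real ones live in the site's channel list.
const channels = [
  { site_id: 'premium_536', xmltv_id: 'Example1.jp', name: 'Channel A' },
  { site_id: 'premium_537', xmltv_id: 'Example1.jp', name: 'Channel B' },
  { site_id: 'premium_538', xmltv_id: 'Example2.jp', name: 'Channel C' }
]

// Group site_ids by xmltv_id and keep only the ids that are used more than once.
function findDuplicateIds(list) {
  const byId = new Map()
  for (const { xmltv_id, site_id } of list) {
    if (!xmltv_id) continue // unmapped channels cannot clash
    if (!byId.has(xmltv_id)) byId.set(xmltv_id, [])
    byId.get(xmltv_id).push(site_id)
  }
  return [...byId].filter(([, ids]) => ids.length > 1)
}

console.log(findDuplicateIds(channels))
```

Each entry in the result pairs a duplicated id with the `site_id`s that share it, which makes it easy to see which mapping needs correcting.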

@BellezaEmporium
Contributor

BellezaEmporium commented Aug 27, 2024

Here's the code that works on my side:

const axios = require('axios')
const dayjs = require('dayjs')
const cheerio = require('cheerio')
const duration = require('dayjs/plugin/duration')
const utc = require('dayjs/plugin/utc')
const timezone = require('dayjs/plugin/timezone')
const customParseFormat = require('dayjs/plugin/customParseFormat')

dayjs.extend(utc)
dayjs.extend(timezone)
dayjs.extend(customParseFormat)
dayjs.extend(duration)

module.exports = {
    site: 'skyperfectv.co.jp',
    days: 1,
    lang: 'ja',
    url: function ({ date, channel }) {
        let [type, ...code] = channel.site_id.split('_')
        code = code.join('_')
        return `https://www.skyperfectv.co.jp/program/schedule/${type}/channel:${code}/date:${date.format('YYMMDD')}`
    },
    logo: function ({channel}) {
        return `https://www.skyperfectv.co.jp/library/common/img/channel/icon/basic/m_${channel.site_id.toLowerCase()}.gif`
    },
    // Helper that sends the confirmation cookie so that NSFW channels are included
    async fetchSchedule({ date, channel }) {
        const url = this.url({ date, channel });
        const response = await axios.get(url, {
            headers: {
                'Cookie': 'adult_auth=true'
            }
        });
        return response.data;
    },
    async parser({ date, channel }) {
        const sched = await this.fetchSchedule({ date, channel });
        const $ = cheerio.load(sched)
        const programs = []

        const sections = [
            { id: 'js-am', addition: 0 },
            { id: 'js-pm', addition: 0 },
            { id: 'js-md', addition: 1 }
        ]

        sections.forEach(({ id, addition }) => {
            $(`#${id} > td`).each((index, element) => {
                // `td` is a column for a day
                // the next `td` will be the next day
                const today = date.add(index + addition, 'd').tz('Asia/Tokyo')
                
                const parseTime = (timeString) => {
                    // timeString is in the format "HH:mm"
                    // replace `today` with the time from timeString
                    const [hour, minute] = timeString.split(':').map(Number)
                    return today.hour(hour).minute(minute)
                }
                
                const $element = $(element) // Wrap element with Cheerio
                $element.find('.p-program__item').each((itemIndex, itemElement) => {
                    const $itemElement = $(itemElement) // Wrap itemElement with Cheerio
                    const [start, stop] = $itemElement.find('.p-program__range').first().text().split('〜').map(parseTime)
                    const title = $itemElement.find('.p-program__name').first().text()
                    const image = $itemElement.find('.js-program_thumbnail').first().attr('data-lazysrc')
                    programs.push({
                        title,
                        start,
                        stop,
                        image
                    })
                })
            })
        })
        
        return programs
    },
    async channels() {
        const pageParser = (content, type) => {
            // type: "basic" | "premium"
            // Returns an array of channel objects

            const $ = cheerio.load(content)
            const channels = []
        
            $('.p-channel').each((index, element) => {
                const site_id = `${type}_${$(element).find('.p-channel__id').text()}`
                const name = $(element).find('.p-channel__name').text()
                channels.push({ site_id, name, lang: 'ja' })
            })
        
            return channels
        }

        const getChannels = async (type) => {
            const response = await axios.get(`https://www.skyperfectv.co.jp/program/schedule/${type}/`, {
                headers: {
                    'Cookie': 'adult_auth=true;'
                }
            })
            return pageParser(response.data, type)
        }
        
        const fetchAllChannels = async () => {
            const basicChannels = await getChannels('basic')
            const premiumChannels = await getChannels('premium')
            const results = [...basicChannels, ...premiumChannels]
            return results
        }

        return await fetchAllChannels()
    }
}
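
To make the `site_id` handling in `url()` easier to follow, here is a small standalone sketch of the same split. The helper names `splitSiteId` and `buildUrl` and the sample id `premium_ab_12` are hypothetical; the date is assumed to be already formatted as `YYMMDD`, which the config does with dayjs:

```javascript
// Mirrors the split used in url(): everything before the first underscore
// is the type, the rest (underscores included) is the channel code.
function splitSiteId(siteId) {
  const [type, ...rest] = siteId.split('_')
  return { type, code: rest.join('_') }
}

// `yymmdd` is assumed to be pre-formatted, e.g. by dayjs's format('YYMMDD').
function buildUrl(siteId, yymmdd) {
  const { type, code } = splitSiteId(siteId)
  return `https://www.skyperfectv.co.jp/program/schedule/${type}/channel:${code}/date:${yymmdd}`
}

console.log(buildUrl('premium_ab_12', '241014'))
// → https://www.skyperfectv.co.jp/program/schedule/premium/channel:ab_12/date:241014
```

Joining the remainder back with `_` is what keeps a channel code that itself contains underscores intact.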

@Animenosekai
Contributor Author

By the way, channel 536 and channel 537 have the same tvg-id for different "premium IDs". Is it on purpose? It could clash when read by a player.

Indeed, it was SITE777TV.jp

Contributor

@BellezaEmporium BellezaEmporium left a comment

LGTM.

Collaborator

@freearhey freearhey left a comment

Test failed:

npm test -- skyperfectv.co.jp

> test
> run-script-os skyperfectv.co.jp


> test:default
> TZ=Pacific/Nauru npx jest --runInBand skyperfectv.co.jp


 RUNS  sites/skyperfectv.co.jp/skyperfectv.co.jp.test.js
/Users/Arhey/Code/iptv-org/epg/sites/skyperfectv.co.jp/skyperfectv.co.jp.config.js:36
        const sched = await this.fetchSchedule({ date, channel });
                                 ^

TypeError: Cannot read properties of undefined (reading 'fetchSchedule')
    at parser (/Users/Arhey/Code/iptv-org/epg/sites/skyperfectv.co.jp/skyperfectv.co.jp.config.js:36:34)
    at Object.<anonymous> (/Users/Arhey/Code/iptv-org/epg/sites/skyperfectv.co.jp/skyperfectv.co.jp.test.js:25:18)
    at Promise.then.completed (/Users/Arhey/Code/iptv-org/epg/node_modules/jest-circus/build/utils.js:298:28)
    at new Promise (<anonymous>)
    at callAsyncCircusFn (/Users/Arhey/Code/iptv-org/epg/node_modules/jest-circus/build/utils.js:231:10)
    at _callCircusTest (/Users/Arhey/Code/iptv-org/epg/node_modules/jest-circus/build/run.js:316:40)
    at _runTest (/Users/Arhey/Code/iptv-org/epg/node_modules/jest-circus/build/run.js:252:3)
    at _runTestsForDescribeBlock (/Users/Arhey/Code/iptv-org/epg/node_modules/jest-circus/build/run.js:126:9)
    at run (/Users/Arhey/Code/iptv-org/epg/node_modules/jest-circus/build/run.js:71:3)
    at runAndTransformResultsToJestFormat (/Users/Arhey/Code/iptv-org/epg/node_modules/jest-circus/build/legacy-code-todo-rewrite/jestAdapterInit.js:122:21)
    at jestAdapter (/Users/Arhey/Code/iptv-org/epg/node_modules/jest-circus/build/legacy-code-todo-rewrite/jestAdapter.js:79:19)
    at runTestInternal (/Users/Arhey/Code/iptv-org/epg/node_modules/jest-runner/build/runTest.js:367:16)
    at runTest (/Users/Arhey/Code/iptv-org/epg/node_modules/jest-runner/build/runTest.js:444:34)

Node.js v18.18.2
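
The trace is consistent with `parser` being called as a plain function, for instance after being destructured from the exported object, in which case `this` is `undefined` in strict mode. Below is a minimal sketch of one way to avoid depending on `this`: move the helper to module scope. The helper body is a stand-in for the real request and not the code that was eventually merged:

```javascript
// Hypothetical stand-in for the HTTP request; a real config would call axios here.
async function fetchSchedule({ date, channel }) {
  return `<p>${channel.site_id} on ${date}</p>`
}

const config = {
  site: 'skyperfectv.co.jp',
  // Calls the module-level helper, so nothing in here reads `this`.
  async parser({ date, channel }) {
    const html = await fetchSchedule({ date, channel })
    return [{ title: html }]
  }
}

// Destructuring detaches `parser` from `config`, as a test file might do.
const { parser } = config
parser({ date: '241014', channel: { site_id: 'basic_100' } }).then(programs => {
  console.log(programs)
})
```

Because `parser` no longer looks anything up on `this`, it behaves the same whether it is called as `config.parser(...)` or as a detached function.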

@PopeyeTheSai10r
Collaborator

@Animenosekai - skyperfectv.co.jp is failing. It either needs to be updated or removed.

@Animenosekai
Contributor Author

Screenshot 2024-10-13 at 21 38 39

@PopeyeTheSai10r Should work fine now 👍

Contributor

@BellezaEmporium BellezaEmporium left a comment

As for s.mxtv.jp, their website is so counterintuitive...

Everything you need should be in https://s.mxtv.jp/bangumi/js/common_bangumi.js?20240311 and https://s.mxtv.jp/bangumi/js/timetable.js?20240820

@Animenosekai, here is what you can try:

GET https://s.mxtv.jp/bangumi/link/weblinkU.csv?1728896511558 and https://s.mxtv.jp/bangumi/link/weblinkU_ir.csv?1728896511558 (the query string is an epoch timestamp, Asia/Tokyo TZ), compare the program names in your EPG result with the ones in these CSVs, then GET the image linked on the matching line. This will need some CSV parsing.

You may also use a library like "didyoumean2", which uses the Levenshtein algorithm to find the closest name when there isn't a perfect match. I can't try it as I'm not a Japanese speaker myself, so you'll probably have better luck than me getting it to work on your side.
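
As a rough illustration of the matching idea, here is a self-contained sketch that implements the Levenshtein distance by hand rather than guessing at the "didyoumean2" API. The row shape (`name`, `image`), the sample rows, the `closestRow` helper and the `maxDistance` threshold are all assumptions, since the CSV columns are not described in this thread:

```javascript
// Levenshtein distance with a single rolling row (insert, delete, substitute).
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j)
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0] // value from the previous row, one column to the left
    row[0] = i
    for (let j = 1; j <= b.length; j++) {
      const above = row[j] // value from the previous row, same column
      row[j] = Math.min(
        above + 1, // deletion
        row[j - 1] + 1, // insertion
        diagonal + (a[i - 1] === b[j - 1] ? 0 : 1) // substitution
      )
      diagonal = above
    }
  }
  return row[b.length]
}

// Returns the row whose name is closest to `title`, or null if none is within maxDistance.
function closestRow(title, rows, maxDistance = 3) {
  let best = null
  for (const row of rows) {
    const distance = levenshtein(title, row.name)
    if (distance <= maxDistance && (best === null || distance < best.distance)) {
      best = { ...row, distance }
    }
  }
  return best
}

// Hypothetical rows, as they might look after parsing the CSV.
const rows = [
  { name: 'ニュース7', image: 'https://example.com/news7.jpg' },
  { name: 'アニメ劇場', image: 'https://example.com/anime.jpg' }
]

console.log(closestRow('ニュース７', rows)) // full-width digit instead of "7"
```

The threshold keeps unrelated titles from being matched to an arbitrary image; a suitable value would have to be tuned against real program names.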

@freearhey freearhey merged commit 7610f7b into iptv-org:master Oct 14, 2024
Development

Successfully merging this pull request may close these issues.

Channel Suggestions
4 participants