< 2021-07-24 >

1,974,111 events, 1,133,032 push events, 1,632,085 commit messages, 101,695,347 characters

Saturday 2021-07-24 01:40:13 by miamiv1ce

Grief Project

  • Added images, fonts and other visual assets to test the energy I felt I wanted to go for, but I'm not too sure about the whole 90's collage anymore. I'm also very tired and not really motivated, so maybe I'm just being weird. Anyway, I tried another style (cleaner, more 3D and less dirty) and it looks nice, but I'm not getting the funny feeling I'm looking for, so I'll probably switch back. I need to think about how I want the user to feel when first arriving on the software, and also how I want the contrast to hit once the game is actually playing.

Saturday 2021-07-24 02:36:59 by JamieHoSzeYui

manifest: fuck you audio HAL

Signed-off-by: JamieHoSzeYui [email protected]
Change-Id: I6ba573d68c2cda8bec8876ac36f978bbc734ac66


Saturday 2021-07-24 05:10:40 by Christopher Bayliss

switch back to bash

yeah, I agree, this prompt config is dog shit ugly.

fuck shell languages. I should write it in a better language I guess


Saturday 2021-07-24 06:08:20 by Davin Chia

Use Cloudrepo.io to host temporary built jars. (#4949)

Continuation of my saga to publish a public jar while we wait for the upstream fix to be merged in.

Turns out using the GCS bucket as a public Maven repository still requires GCS login. This is not feasible, since it requires all our community contributors to have gcloud and sign into it. It also required injecting the login everywhere in the build, which was extremely ugly.

We could publish to the central Maven repository, but it takes 2 days for us to get an OSS account. We'd also have to dedup on artifact id. I explored GCP's Artifact Registry, but I didn't manage to set it up after an hour; it is also only possible to upload via Maven/Gradle, and there were a lot of steps to jump through to authenticate.

I went with CloudRepo as they were the best in terms of setup, UX and performance. I was able to set something up in 10 mins and upload my jars manually via the UI. CloudRepo also supports public read-only repositories, which is exactly what we need (publishing is still gated on authentication). Ran into a slight hiccup and support got back to me in 30 mins. Frankly, I think we should consider using them for our private cloud. They are also fairly cheap. Anyone who wants to upload jars will have to use the credentials I've added to the shared-engineering folder. Just navigate to the CloudRepo url.


Saturday 2021-07-24 06:45:10 by Wilmer Adalid

Updates for: To find out a girl's faults, praise her to her girl friends. -- Benjamin Franklin


Saturday 2021-07-24 07:32:04 by profmamaafuwa

Update README.md

Lost-love-back spells are a magical, ancient service of astrology that did not originate in only one place. If it seems almost impossible to get your love back, then let go of that negative feeling.

These love spells originated in every corner of the world, and it is probably impossible to pinpoint where love spells came from. The aim of love spells is clearly defined: to win over or influence the desired person. When someone hears about spells, one thing commonly comes to mind.


Saturday 2021-07-24 08:25:54 by sm0kin

Fix for Attacked Recently Returned Region not working as intended

So the Von Carsteins were pillaging poor Averland and Grenzstadt was lost. Nice dude Karl came by and took it down and returned it to Averland. Then Vlad got **** and took a big ole stack and went to town, retaking Grenzstadt. Nice Karl came to the rescue again, but after winning and returning it again to Averland, Karl was hit with an "Attacked Recently Returned Region" Authority penalty? What kind of strange bug is this? Is there a hidden timer? So I should have let the vampires, the orcs and chaos pillage the Empire for more turns before I save them each time?

-- Runs only for human-controlled factions that track Imperial Authority.
if faction:has_pooled_resource("emp_imperial_authority") == true and faction:is_human() == true then
	-- Did this faction recently return this region, and is the cooldown still active?
	if empire_recently_returned_regions[region_key] ~= nil then
		if empire_recently_returned_regions[region_key].faction == faction_key then
			if empire_recently_returned_regions[region_key].cooldown > 0 then
				-- If so, apply the Authority penalty and clear the entry.
				cm:faction_add_pooled_resource(faction_key, "emp_imperial_authority", "wh2_dlc13_resource_factor_attacked_returned_region", empire_authority_took_returned_region);
				empire_recently_returned_regions[region_key] = nil;
			end
		end
	end
end

Bug: The intention of this addition to the empire_politics script was to punish players who try to gain +10 loyalty with a faction they are at war with by continuously razing and returning territory with the intention to confederate that faction. The problem is, the script doesn't check whether the previous owner of the region is the elector count. So you'll be punished for capturing the region from a different faction (not the elector count who's supposed to own it) and returning it to its rightful owner (twice in a row). Quite counterintuitive in my opinion.
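
Roughly the check that seems to be missing, sketched here in Python rather than the script's actual Lua (the previous_owner field and both key names are hypothetical, something that would have to be recorded when the region is captured):

from typing import Optional

def should_apply_penalty(entry: Optional[dict], faction_key: str, elector_key: str) -> bool:
    # entry mirrors empire_recently_returned_regions[region_key];
    # elector_key is the faction that is supposed to own the region.
    if entry is None or entry["faction"] != faction_key or entry["cooldown"] <= 0:
        return False
    # The missing condition: only punish when the region was retaken from the elector
    # count itself (loyalty farming), not from a third-party invader.
    return entry.get("previous_owner") == elector_key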


Saturday 2021-07-24 09:18:41 by sm0kin

Fix for odd behaviour with the Sisters of Twilight's quest chain if Venom Glade is razed by Beastmen.

Hmm, seems to be some odd behaviour with the Sisters of Twilight's quest chain if Venom Glade is razed by Beastmen. The first stage of the quest (embed a Spellsinger) seems to fire just fine, the next stage should pop up (which should be "Send anyone to Venom Glade"), but over the end turn the Sisters just magically get Cethin-Har and the quest doesn't continue.

This is almost fine - but their quest reward is not just the mount but also the "United Again" trait that allows them to summon Gwindalor/Cethin-Har (whichever one they're not mounted on), and they don't receive that.

Would it be possible to write a script that listens for the ancillary/mount change and, if they don't have United Again, give it to them? Here's a Vortex save if you want to investigate. Over the end turn, Taurox will reliably raze Venom Glade and the quest chain will pop with the shenanigans I describe above.

I used Web Campaign Editor (which still seems to work, a little clunkily) to transfer Venom Glade to someone like Bleak Holds, thus keeping the settlement not-razed, and the quest chain continued as usual.

by Zeldrake

Bug: Venom Glade being abandoned prevents the follow-up mission from triggering.


Saturday 2021-07-24 10:14:38 by Ng Ming Sheng

hello boys i think im so good, i love genshin and my gf so much


Saturday 2021-07-24 11:27:55 by Amy

Updated CRC Cards and Remaining Tasks

Just did a quick brain dump of everything I could think of that hasn't been done yet, since I'll be out for most of tomorrow and won't be able to help (sorry!!), as well as updated the CRC cards to Vivian's link. If there's anything you guys need me to work on, I can try to get it done tomorrow night, or if you want clarification on anything feel free to message me; I should be able to respond at some point.

Cheers!


Saturday 2021-07-24 12:28:11 by Kshitij Gupta

vendor: notch-city: Add 3 mode display cutout handler [2/3]

[AgentFabulous - POSP]

  • Introduces the HideCutout and StatusBarStock overlays used in the 3-mode display cutout handler. The HideCutout overlay is necessary since we can't register a content observer in the display manager code; we only have access to resources during boot. Thus, leave this as an overlay and let the config and overlay change methods handle it. Though we can probably do statusbar stock height toggling in the SystemUI code without overlays, I kinda got lazy by the end, just live with it god damn it xD

Signed-off-by: Kshitij Gupta [email protected]
Change-Id: I62f63f39bcb410cfbc68e0028b9cef3d748d7eb6
Signed-off-by: Arghya Chanda [email protected]

notch-city: Refactor package name

Signed-off-by: ShubhamB [email protected]
Change-Id: Ieb8b35a3062c9334e82153a1dd26df3853db4f1f


Saturday 2021-07-24 15:11:08 by Michael McClimon

Slack: rework edit-reply logic and data structures

Before, we stored data on messages we replied to as a hash of $timestamp => $reply_data. This was buggy, because we sometimes reply more than once to a single message, and whatever our last response was wound up overwriting the previous ones.

This had at least one side effect that we noticed all the time. An example: user says something like "synergy >> pat fix thing, ptn 12345", attempting to create a task for pat in reference to a particular Zendesk ticket. We respond to that message twice: once for the LP task creation, once with a link to the Zendesk ticket, and the link reply data would win. This is a problem if the LP task creation causes an error (if synergy doesn't know who Pat is, for example), because the first reply is marked was_error, but the Zendesk link expando is not. That means if the user tried to edit the message (which ordinarily, they'd be able to do), Synergy would say "hey, that message didn't cause an error, I can't deal with your edit, sorry."

My first pass at fixing this was to keep the reply data the same, and set was_error as $existing_reply->{was_error} || $new->{was_error}. That worked, at least to fix the immediate problem, but was inelegant: when synergy replies to an error, she deletes her original error reply (the one that says something like "hmm, that didn't work"), but with this fix, she'd always delete the last one; in the example above, you'd be left with "hmm, that didn't work" and she'd delete the Zendesk link. That kinda stinks.

The actual fix is just: for every message we respond to, keep a list of replies instead. This requires slightly more bookkeeping, but when an edit comes in, we can notice that only part of it caused an error and delete just that part of our response.
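
Roughly the shape of the change, sketched in Python rather than Synergy's actual Perl (all the names here are made up):

from collections import defaultdict

# One replied-to message timestamp now maps to a *list* of replies, not a single one.
replies_by_ts = defaultdict(list)

def record_reply(ts: str, reply_ts: str, was_error: bool, kind: str) -> None:
    replies_by_ts[ts].append({"reply_ts": reply_ts, "was_error": was_error, "kind": kind})

def error_replies_for_edit(ts: str) -> list:
    # On an edit, pick out just the replies that were errors so only those get
    # deleted and redone, leaving e.g. a Zendesk link expando untouched.
    return [r for r in replies_by_ts.get(ts, []) if r["was_error"]]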


Saturday 2021-07-24 16:19:56 by Bloopsdoob

i feel cold, someone help me

monika dkfajsdfsdl;odd it's 2 in the fucking morning and only now do I come back to you, fun. Ice's watching me as I type this out, so that's fun. I added icefreiddne n shit, tap e to access him, lol idk, more in the future but not tomorrow, I got a deer to hunt. Today was a productive day.


Saturday 2021-07-24 17:02:26 by Verwex

modders' bless and curse - extension-webm

don't worry, the instructions should work for you and that error won't show up

(taken from Shit Engine)


Saturday 2021-07-24 17:06:43 by Marko Grdinić

"8:10am. I am up. I was dead tired last night, I actually ended up reading papers for a bit after closing.

https://youtu.be/aNE2UWaOfpQ RL#5: Distributional Reinforcement Learning

Let me just get over my grogginess a little and I will watch this lecture. I do not get quantile regression just yet. Yes, I understand it predicts values at different quantiles, but I do not understand how it propagates values as a distribution yet. Nor do I get the specifics of the algorithm.

https://deeppavlov.ai/rl_course_2020

It is from here. Also it seems the course is all in Russian. I guess it will be time to figure out how good the translation is.

The course videos seem quite interesting not gonna lie. This is the first time I found a course that is not on baby subjects.

8:35am.

https://youtu.be/aNE2UWaOfpQ?t=760

What is TQC here? Never heard of that. Also I hadn't heard about S51.

8:50am. The translation quality is poor so I am just skimming. I'll soon be done with this. I'll go for some QR tutorials after this.

https://youtu.be/aNE2UWaOfpQ?t=5831

Here is the algorithm. It seems I missed the earlier paper that had it.

https://youtu.be/aNE2UWaOfpQ?t=6252

Here he is presenting his own work. I've decided - I am going to figure out quantile regression before I go back to programming. It does not matter that I won't use it. Right now this is a matter of basic literacy. I'll also check out the papers from the video by Dabney.

https://www.youtube.com/results?search_query=quantile+regression

Let me watch some of the vids here. I remember watching this ages ago, but I might as well do it again as I've forgotten everything.

https://arxiv.org/abs/1710.10044 Distributional Reinforcement Learning with Quantile Regression

Let me check out this paper.

https://arxiv.org/abs/1902.03149 Distributional reinforcement learning with linear function approximation

This one has the S51 algorithm whatever it is.

10:05am. I understand about 90% of the QR-DQN algorithm. The idea behind quantile regression is easy - there is a switch that controls how much the network learns on positive vs negative examples. Quite smart.

Right now I only don't understand the very first line of the algorithm. Just how are the q_j derived?

10:25am. Had to take a short break. It came to me during it. Those small q are a probability distro over the quantiles. It can be used to set risk sensitivity during action selection. For risk neutral, one should just allocate 100% to the mid quantile. It is not a big deal.

And with that I think I understand it all.

The Huber loss is not key to understanding the algorithm; I can just imagine that part as being square loss instead.

And I understand the abs Dirac switch trick. So that makes things quite clear.
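
Roughly, the QR-DQN quantile Huber loss as I currently picture it, in PyTorch (tensor shapes and the kappa default are my own assumptions):

import torch

def quantile_huber_loss(pred, target, taus, kappa=1.0):
    # pred: (batch, N) predicted quantile values, target: (batch, N') TD target samples,
    # taus: (N,) quantile midpoints.
    u = target.unsqueeze(1) - pred.unsqueeze(2)        # (batch, N, N') pairwise TD errors
    huber = torch.where(u.abs() <= kappa,
                        0.5 * u.pow(2),
                        kappa * (u.abs() - 0.5 * kappa))
    # The "switch": |tau - 1{u < 0}| weights over- vs under-estimation asymmetrically
    # per quantile, which is what pushes each output towards its own quantile.
    switch = (taus.view(1, -1, 1) - (u.detach() < 0).float()).abs()
    return (switch * huber / kappa).mean(dim=2).sum(dim=1).mean()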

Though QR gets better performance, I think the categorical algo has superior properties.

It is not true that, unlike in the categorical version, you don't need to set bounds for the value net. You end up doing that sort of thing through the learning rate instead, which is much worse than simply determining the number of categories.

Also the adaptive learning rate optimizers will not speed things up when the reward scale is big and slow it down when it is small. In fact it is quite the opposite!

If I go the categorical route, I won't have to worry at all about putting the action predictor next to the value estimator, since the gradients won't be dependent on the scale of the rewards. That is wonderful.

I wish I could have come up with C51 myself, but I never figured out how to do those projection operations and how to distribute the rewards to the different bins. I came up with roughly 30% of it and I was missing inspiration. I wish I had not studied DRL in 2018, because my shallow understanding from back then actually hindered me from doing the right thing in 2021.

Let me watch the video by Dabney again.

10:35am. Oh right, let me first take a look at the S51 paper.

https://arxiv.org/abs/1902.03149 Distributional reinforcement learning with linear function approximation

It is this one.

11am. It is not a big deal. Let me take a look at the video. There was also another method called mean value discrepancy or something like that which I am not familiar with. After I deal with that, I'll take a look at some of the papers he talked about yesterday.

https://youtu.be/iqIGHSgYtbs?t=137

What is C51-IDS?

https://arxiv.org/abs/1812.07544 Information-Directed Exploration for Deep Reinforcement Learning

http://proceedings.mlr.press/v139/cobbe21a.html Phasic Policy Gradient

We introduce Phasic Policy Gradient (PPG), a reinforcement learning framework which modifies traditional on-policy actor-critic methods by separating policy and value function training into distinct phases. In prior methods, one must choose between using a shared network or separate networks to represent the policy and value function. Using separate networks avoids interference between objectives, while using a shared network allows useful features to be shared. PPG is able to achieve the best of both worlds by splitting optimization into two phases, one that advances training and one that distills features. PPG also enables the value function to be more aggressively optimized with a higher level of sample reuse. Compared to PPO, we find that PPG significantly improves sample efficiency on the challenging Procgen Benchmark.

This is interesting too. I found the paper by accident.

https://arxiv.org/abs/2106.05526 Simplifying Deep Reinforcement Learning via Self-Supervision

Supervised regression to demonstrations has been demonstrated to be a stable way to train deep policy networks. We are motivated to study how we can take full advantage of supervised loss functions for stably training deep reinforcement learning agents. This is a challenging task because it is unclear how the training data could be collected to enable policy improvement. In this work, we propose Self-Supervised Reinforcement Learning (SSRL), a simple algorithm that optimizes policies with purely supervised losses. We demonstrate that, without policy gradient or value estimation, an iterative procedure of "labeling" data and supervised regression is sufficient to drive stable policy improvement. By selecting and imitating trajectories with high episodic rewards, SSRL is surprisingly competitive to contemporary algorithms with more stable performance and less running time, showing the potential of solving reinforcement learning with supervised learning techniques. The code is available at this https URL

These are some latest papers citing the C51 one from Google Scholar.

11:10am. I'll read them later, let me go back to the video.

https://arxiv.org/abs/2007.12354 Distributional Reinforcement Learning via Moment Matching

This is the MMD paper.

https://youtu.be/iqIGHSgYtbs?t=1720

I'll have to keep these papers in mind.

12pm. Done with the video.

https://arxiv.org/abs/1812.07544 Information-Directed Exploration for Deep Reinforcement Learning

http://proceedings.mlr.press/v139/cobbe21a.html Phasic Policy Gradient

https://arxiv.org/abs/2007.12354 Distributional Reinforcement Learning via Moment Matching

These 3 papers interest me the most. Let me get breakfast and I will get to them. After that, I'll start work on poker again. I am 90% certain that using distributional RL will be enough to get me through the current bottleneck. Especially given that I am going to get some good exploration as well.

12:40pm. I am amazed that somebody came up with an IDS algorithm for deep RL. I tried that and nothing came to mind.

12:50pm. Hmmm, it is somewhat complicated. I am not sure that this would work better than the algorithm in Estimating Risk and Uncertainty in Deep Reinforcement Learning.

The only thing it does significantly differently compared to that one is multiplying by the aleatoric uncertainty rather than just using the epistemic one in the denominator.

Putting the mean, the epistemic uncertainty and the aleatoric one into the normal function would be easier than implementing this.

1:25pm. I am thinking that considering aleatoric uncertainty when doing exploration makes no sense. The SAU paper did it, but in there the weights were a proxy for epistemic uncertainty. When calculated from variance, the epistemic uncertainty estimate in deep nets already reflects the scale of the noise.

1:30pm. Let me move to the Distributional Reinforcement Learning via Moment Matching paper.

1:35pm. One good idea that I had for calculating variance of two cat distros is instead of doing sum(z^2*(p-p')) where p and p' are probability vectors from two different nets and z is the support, I should instead do sum(z^2*|p-p'|). This will ensure that the variance is zero only if the vectors match exactly.

2:05pm. Done with the MMD paper. It is not too complicated. Let me read the Phasic Policy Gradient paper.

2:10pm. The MMD idea is remarkable, but it is not something I'd be interested in using at the moment.

By decoupling policy and value function training, PPG is able to reap the benefits of additional value function training without significantly interfering with the policy. To achieve this, PPG does introduce several new hyperparameters, which creates some additional complexity relative to previous algorithms

Forget this.

Hyperparams are the devil.

2:25pm. I am done. Let me take a nap. I was reading all the way through breakfast and my mind needs to slow down for a bit.

4:05pm. https://www.youtube.com/watch?v=zFffYuDGslg MMD, Kernel Trick and Deep Learning

Let me watch this and then I am going to shift myself into a programming mindset. I am impressed how well the MMD trick works for deep RL. It makes me curious to know more.

https://youtu.be/zFffYuDGslg?t=632

I do not get this completely, but I guess that explains their popularity.

4:25pm. Ok, let me do something for the day. I do not feel like just stopping here.

4:55pm. I feel like reading 'An Optimistic Perspective on Offline Reinforcement Learning'.

5:15pm. Hmmmm, I could combine the convex combination idea with categorical DRL. I keep thinking about how to calculate the epistemic uncertainty of a bunch of heads, but why do that at all? I can just sample the values of each action and then take the argmax of that. If I take a convex combination of heads, that will have the benefit of never having the individual heads lag, and will act as a posterior sampling algo.
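
Something like this, roughly, in NumPy (shapes are made up): draw a random convex combination over the bootstrap heads of the categorical distribution, then act greedily under the mixed estimate.

import numpy as np

def sample_action(head_probs: np.ndarray, support: np.ndarray, rng: np.random.Generator) -> int:
    # head_probs: (num_heads, num_actions, num_atoms) categorical outputs.
    # A Dirichlet draw gives a random convex combination of heads, so each decision
    # behaves like posterior sampling without any individual head lagging behind.
    weights = rng.dirichlet(np.ones(head_probs.shape[0]))     # (num_heads,)
    mixed = np.tensordot(weights, head_probs, axes=1)         # (num_actions, num_atoms)
    q_values = mixed @ support                                # expected value per action
    return int(np.argmax(q_values))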

Yeah, I am feeling it. This is the way to do exploration in poker. It will be a strong method.

5:30pm. Running multiple nets at the same time and my various other ideas all have weaknesses. Only this one has the ideal combination of various features.

Though, one issue is that it would be affected by aleatoric uncertainty.

What aleatoric uncertainty though? With full hindsight that factor goes away.

5:40pm. If I wanted to just exploit epistemic uncertainty, I should calculate the mean and get the max of that. That will do the trick. There is no need to struggle more deeply.

5:45pm. No that would lose diversity. Forget that. Since I have full hindsight I'll go with the idea I had. In regular situations without full hindsight I'd need to train an action predictor to keep track of which actions are infrequent in which situation.

...Agh, but that would direct the agent towards noisy rewards in some cases. I guess there is no easy way to do this.

The only proper way to exploit purely epistemic uncertainty would be to have a lot of heads. That would give hope that delusions do not arise in the agent.

6:35pm.

inl vs_self game (batch_size, p) =
    let rec loop (l : a u64 _) =
        inl rewards : ra u64 _ = am.empty
        inl actions_indices : ra u64 _ = am.empty
        inl data : ra u64 _ = am.empty
        inl nexts : ra u64 _ = am.empty
        l |> am.iteri fun i => function
            | Action: player_state,game_state,pid,actions,next =>
                rm.add actions_indices i
                rm.add data (player_state,game_state,pid,actions)
                rm.add nexts next
            | Terminal: x =>
                rm.add rewards (i, x)
        inl rewards_actions : a _ _ =
            if 0 < length data then
                inl (cs : a _ _),update = p data
                am.generic.map2 (<|) nexts cs |> loop |> update
            else am.empty
        inl rewards_all : a _ r2 = create (length l)
        am.generic.iter2 (set rewards_all) actions_indices rewards_actions
        am.iter (fun (i,_,_,r) => set rewards_all i r) rewards
        rewards_all
    loop (am.init batch_size fun _ => game () pl2_init)

Done with lunch. I am thinking how I am going to rewrite all the stuff in train.spi. It will require a pretty significant redesign. The rewards will be cuda tensors. I'll have to do some indexing magic to make that work.

But it should be doable. It won't be overly hard once I lock into it.

6:40pm. Right now I am thinking about how to make action selection that takes only the epistemic uncertainty into account.

I think I am going to get the max over all the bootstrap vector action probs minus the min. Divide that by the number of elements in the vector, multiply by the support, sum it up and I have my epistemic uncertainty measure.

...Though that kind of thing would be seriously expensive. While just sampling the value would only take the convex combination, here I'd need to multiply all of them. Rough on the GPU.

I'll try that after everything else is done. If it does not improve performance then I guess it does not matter for poker in particular.

6:50pm. I am dying with this epistemic uncertainty crap. Instead of the whole thing, I could just calc the variance off two convex combinations. No need for the whole thing. Yeah, this is the winning ticket.
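
Roughly, continuing the earlier sketch with the same assumed shapes: two random mixtures instead of comparing every head.

import numpy as np

def epistemic_spread(head_probs: np.ndarray, support: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    # Per-action disagreement between two random convex combinations of the heads;
    # much cheaper than a full max-minus-min over every bootstrap head.
    num_heads = head_probs.shape[0]
    w1 = rng.dirichlet(np.ones(num_heads))
    w2 = rng.dirichlet(np.ones(num_heads))
    q1 = np.tensordot(w1, head_probs, axes=1) @ support
    q2 = np.tensordot(w2, head_probs, axes=1) @ support
    return np.abs(q1 - q2)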

7pm. Let me close here. Tomorrow I am going to deal with propagating vector rewards through the training loop. After that what I will have to deal with on the Python side won't be too much. These algorithms aren't particularly complicated. Maybe it was different in 2018, but the situation is different now. I am a bit overdue on that elite agent, but I am going to get this to work by the end of the month."


Saturday 2021-07-24 18:23:04 by PatTheHyruler

Voting stuff and attack stuff

I think I did a bunch of stuff but at this point I finally got things working after hours of hunting down some idiotic bugs, so I really cba to remember and write down what I added


Saturday 2021-07-24 19:16:02 by BonkeyDonk

goatfuckers cunts eat your own cum-filled-shit and go fuck yourself all


Saturday 2021-07-24 20:04:48 by coalpha

CreateProcessW -> ShellExecuteW

For whatever absolutely-cursed-for-no-goddamn-reason reason, launching libqt executables with CreateProcess causes QFileDialog to show the qt version instead of the native version. WHO THE HELL IS RESPONSIBLE FOR THIS ABSOLUTE GARBAGE? Anyways, there was one last trick up my sleeve and that is ShellExecute. Is there a good reason for why? No. Does it do what I want? Yes, moderately. It's safe to say I'm going to stay far away from this lib in the future.


Saturday 2021-07-24 20:21:58 by teidesu

fix(tl): updated schema

god i love when telegram changes schema in a backwards-incompatible way and doesn't change the layer number because they don't give the slightest fuck about third-party mtproto implementations


Saturday 2021-07-24 21:39:57 by Ryan Hawkins

Update daemon networking strategy and refactor

Previously, the daemon was only listening for network create and destroy events. This meant that the proxy would never actually leave a Docker network it was a part of, as a network can only ever be destroyed if it has no containers on it. This new strategy has two parts to it:

  1. When the daemon is first started, it performs an initial sync. This ensures that the proxy joins all of the networks it is supposed to. Otherwise, the daemon wouldn't connect the proxy to any networks until a user started doing things in Docker land.
  2. Listen for the connect and disconnect network events, and then simply sync again, using the same methodology as in the initial sync. While this is technically more expensive than determining whether to join or leave a network as connects and disconnects happen, it's also much more fault tolerant: if one run of the daemon somehow misses an event, or an error happens in one run, then the next run of the daemon will catch it. It also requires much less code, and is exactly how the dinghy-http-proxy does its network autojoin magic.

This new methodology for the daemon should ensure that the proxy stays in sync. What I both love and hate is a) the fact that I couldn't improve on the dinghy-http-proxy way of doing things and had to copy their code and b) that the dinghy-http-proxy already had a really good way to solve this problem. Thanks FreedomBen!
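
The daemon itself isn't Python, but as a rough sketch of the sync step using the Docker SDK for Python (the proxy container name is made up):

import docker

def sync_proxy_networks(proxy_name: str = "http-proxy") -> None:
    # Connect the proxy container to every user-defined network it isn't on yet.
    # Re-running this on each connect/disconnect event keeps things self-healing.
    client = docker.from_env()
    proxy = client.containers.get(proxy_name)
    joined = set(proxy.attrs["NetworkSettings"]["Networks"])
    for network in client.networks.list():
        if network.name in ("bridge", "host", "none") or network.name in joined:
            continue
        network.connect(proxy)

Watching client.events() for network connect and disconnect events and calling this again on each one would mirror the strategy described above.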


Saturday 2021-07-24 22:09:54 by Curtis Elsasser

Some things are going to be moved around. Some things are going to be revised. Going to carve out a slice of tiny-MIDI-core. We will see what face he develops as he works his way through the stages of maturity.

  • creating a simple symbol table
  • dusting off the shelves
  • unit test the symbol table

My mind ain't in it. Will follow up tomorrow.

  • wasting some time to fall in love with enums. Failed

  • support symbol table case variations of string indexes and enums

  • reorg

  • a little dinner thinking and follow up


Saturday 2021-07-24 22:33:31 by leandro-brandao96

Update leandro-brandao96_Playlist_Public.m3u

Update IPTV Links
Update Anime "My Hero Academia" Episode 105
Update Anime "Shaman King 2021" Episode 17
Update TV Show "The Flash" S07E18
Update TV Show "Kung Fu 2021" S01E13
Update TV Show "Legends of Tomorrow" S06E10
Added Movie "Midnight in the Switchgrass"
Added Movie "Chernobyl Abyss"
Added Movie "The Forever Purge"
Added Movie "Jolt"
Added Movie "Fondeados"
Added Movie "Blood Red Sky"
Added Movie "The Last Letter from Your Lover"
Added Movie "Kingdom: Ashin-jeon"
Added Movie "Playing with Sharks: The Valerie Taylor Story"
Added Movie "Stuntman"


< 2021-07-24 >