Support for reducible Markov chains #20
This commit makes a number of changes, including adding support for calculating stationary distributions on reducible Markov chains, in line with QuantEcon/QuantEcon.py#79:

- Use Graphs.jl to determine the irreducible subsets of a reducible Markov chain
- Changed ``DMarkov`` to ``MarkovChain`` in line with ``mc_tools.py``
- Move to Distributions.jl for sampling
- Removed support for ``Matrix`` types: this is not really necessary, but I think we should either pick matrices or ``MarkovChain`` types rather than maintaining support for both.
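For context on what "stationary distributions on reducible Markov chains" involves, here is a hedged pure-Python sketch (Python because the change mirrors ``mc_tools.py``; the helper names ``recurrent_classes`` and ``stationary`` are hypothetical illustrations, not the library's actual API): find the recurrent, i.e. closed communicating, classes, then compute a stationary distribution supported on each one.

```python
# Sketch only: identify recurrent classes of a (possibly reducible) Markov
# chain given as a row-stochastic nested list P, then compute the stationary
# distribution supported on one recurrent class by power iteration.

def reachable(P, i):
    """Set of states reachable from state i (including i) under P."""
    seen, stack = {i}, [i]
    while stack:
        s = stack.pop()
        for t, p in enumerate(P[s]):
            if p > 0 and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def recurrent_classes(P):
    """Communicating classes that are closed (no escape), i.e. recurrent."""
    n = len(P)
    reach = [reachable(P, i) for i in range(n)]
    classes = []
    for i in range(n):
        # states that communicate with i (reachable in both directions)
        comm = {j for j in reach[i] if i in reach[j]}
        # the class is closed exactly when nothing outside it is reachable
        if reach[i] == comm and sorted(comm) not in classes:
            classes.append(sorted(comm))
    return classes

def stationary(P, cls, iters=10_000):
    """Stationary distribution on a recurrent class via power iteration."""
    m = len(cls)
    # restrict P to the class; rows sum to 1 because the class is closed
    Q = [[P[s][t] for t in cls] for s in cls]
    pi = [1.0 / m] * m
    for _ in range(iters):
        pi = [sum(pi[k] * Q[k][j] for k in range(m)) for j in range(m)]
    return pi

# A reducible example: states {0, 1} form the only recurrent class,
# state 2 is transient (it leaks into {0, 1}).
P = [[0.9, 0.1, 0.0],
     [0.2, 0.8, 0.0],
     [0.3, 0.3, 0.4]]
classes = recurrent_classes(P)      # [[0, 1]]
pi = stationary(P, classes[0])      # close to [2/3, 1/3]
```

This is roughly the structure the PR implements on the Julia side via Graphs.jl (strongly connected components) rather than the naive reachability search used here.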
@ZacCranko Well done for pulling all this together. It's really useful for analysis of Markov chains, and quite unique. The code looks very nice. @spencerlyon2 Zac has written this to bring the jl side up to date with the new functionality provided by @oyamad's Markov chain / network algorithms. Now might not be a great time for you to look at code. We can wait on you if you like. There's no great rush to get this merged.
Hi @ZacCranko I wanted to try your code, but failed to import this version of QuantEcon. Let me report what I did and got:
On an IPython notebook, I ran
The error
@oyamad, this is a problem on the current master branch of QuantEcon. I have fixed it in 0f58183. @ZacCranko you will need to rebase off of this master branch to have the changes propagate into your branch.
samples = [init]
for t=2:sample_size
    last = samples[end]
    push!(samples, rand(dist[last]))
Any reason you use push! instead of preallocating the array up front and filling entries in? If there isn't a good reason, we should change 132-135 to

samples = Array(Float64, sample_size)
samples[1] = init
for t=2:sample_size
    samples[t] = rand(dist[samples[t - 1]])
end
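The same preallocation pattern, sketched in Python terms for comparison with the ``mc_tools.py`` side (hedged: ``simulate`` is a hypothetical helper, and the per-state sampling here uses ``random.choices`` directly rather than the Distributions.jl objects the Julia code uses):

```python
import random

def simulate(P, init, sample_size, rng=random.Random(0)):
    """Simulate a Markov chain path, preallocating the sample array up front
    (the alternative to repeatedly appending, as discussed above)."""
    n = len(P)
    samples = [0] * sample_size      # preallocate
    samples[0] = init
    for t in range(1, sample_size):
        last = samples[t - 1]
        # draw the next state from row `last` of the transition matrix
        samples[t] = rng.choices(range(n), weights=P[last])[0]
    return samples

# Deterministic 2-state chain: the path must alternate 0, 1, 0, 1, ...
P = [[0.0, 1.0],
     [1.0, 0.0]]
path = simulate(P, 0, 6)             # [0, 1, 0, 1, 0, 1]
```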
From what I understand, push!() is as fast as (and sometimes even faster than) preallocating, and much more elegant in my opinion.
Will you do a quick/simple benchmark to compare the two?
If it isn't a performance hit, I'm happy to leave it as you wrote it.
Here is some code to do a quick/simple benchmark:
function testpush(n::Int)
    ret = Array(Float64, 0)
    for i=1:n
        push!(ret, 2.)
    end
    return ret
end

function testprealloc(n::Int)
    ret = Array(Float64, n)
    for i=1:n
        ret[i] = 2.
    end
    return ret
end

testpush(10000)        # warm-up calls so compilation isn't timed
testprealloc(10000)

@elapsed testpush(10000000)
@elapsed testprealloc(10000000)
On my computer preallocation was a hair faster, but the differences are relatively minor. This is just quick and dirty code, but I think it captures the difference you guys are talking about.
Thanks for that @cc7768; fairly similar performance here too.
@cc7768 thanks for the benchmark. Here are my results using a timeit macro that mimics the IPython timeit:
julia> N = 10000000
10000000
julia> @timeit testpush(N)
1 loops, best of 3: 105.78 ms per loop
julia> @timeit testprealloc(N)
10 loops, best of 3: 43.01 ms per loop
I'm seeing a factor of just over 2x slowdown for push! compared to preallocation.
At the end of the day we are talking a few milliseconds here, so it isn't a big deal.
It's up to you @spencerlyon2, I don't mind either way.
@ZacCranko . I've had a quick read through your code and wanted to say it looks great. I see a few things we would want to decide on/add before merging. Some of these are already mentioned above, but I thought listing them here in one place might be helpful (feel free to edit this message and add/check off things as they come up):
If I have missed anything else please add it to the list. One more comment that doesn't require a TODO item: Julia base is finally getting a system for documenting library code. It is only available on the dev version of julia for now, but we will eventually want to implement our within-code documentation according to their conventions. The relevant issue is here. I have not included any docstrings in QuantEcon as of yet because I have been waiting for this to get resolved. Given that you are providing documentation-esque comments, I just wanted to make you aware of this because we will have to change these at some point down the road.
@spencerlyon2 @oyamad Regarding the definition of a Markov chain / process, in general a dynamical system has two elements, a space and a map that acts on the space. The space is an essential part of the definition. A Markov process on a general space is similar. By changing the underlying state space we can change the stability properties of the process, so we need to specify it carefully. However, in the case of a finite Markov chain with n states, the state space is always isomorphic to 1, ..., n (or 0, ..., n - 1 if you prefer). For this reason my preference is that we keep the state space out of the definition of the object. In terms of practicality this works well. For example, yesterday I was simulating Markov chains for a paper using the Python qe code. The state values were stored in an array. Of course that's all just personal preference, but anyway that's my opinion. The initial condition can also be left out in my opinion.
In many applications, you will be carrying these different fields together. Since, as you say, the state is always isomorphic to [1, n], then why not
@albop That seems like a sensible suggestion. It would be nice to keep the Python and Julia sides in sync so why don't we try to get this code merged without the state space being an attribute and add an issue to the issue tracker about adding states to both Python and Julia MC code down the road (defaulting to [0, n-1] and [1, n] respectively). Regarding names I prefer that we just stick with what we have and add the state to the existing class / type.
I didn't pay attention to the state space issue, taking it for granted that the state space is always naturally identified with {0, ..., n-1} (or {1, ..., n}), and so I agree with @jstac on his previous comment. But it also makes sense to carry the state space as @albop says. One possible case is when one wants to obtain the sub-Markov chain corresponding to a recurrent class of a reducible Markov chain. In the current implementation, the state space of the sub-Markov chain will be (implicitly) re-indexed as {0, ..., m-1} (where m is the number of elements of the recurrent class in consideration). (Of course one can keep carrying the original indices of the recurrent class along with the mapping. Note that the same argument applies to the. In NetworkX, it seems that a node can be given any name:
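A small sketch of the bookkeeping @oyamad describes: when a recurrent class is extracted as a sub-chain, the transition matrix gets re-indexed as {0, ..., m-1}, so the original state labels have to be carried alongside it if they are to survive (hedged illustration; ``sub_chain`` and its signature are hypothetical, not part of either library):

```python
def sub_chain(P, cls, labels=None):
    """Restrict the transition matrix P to a recurrent class `cls`,
    returning the re-indexed sub-matrix and the original labels of
    the states it keeps."""
    if labels is None:
        labels = list(range(len(P)))     # default to the natural index set
    Q = [[P[i][j] for j in cls] for i in cls]  # re-indexed as {0, ..., m-1}
    kept = [labels[i] for i in cls]            # original names survive
    return Q, kept

# State 0 is absorbing; {1, 2} is a recurrent class. Extracting it
# re-indexes its states as {0, 1}, but `kept` remembers they were b and c.
P = [[1.0, 0.0, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.5, 0.5]]
Q, kept = sub_chain(P, [1, 2], labels=["a", "b", "c"])
# Q == [[0.5, 0.5], [0.5, 0.5]], kept == ["b", "c"]
```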
@oyamad That's a good point (regarding sub-chains). I think we're in agreement that the state space should be added, defaulting to the natural index set if state values are not specified. @oyamad Would you mind either making those changes on the Python side or noting them as to-do enhancements in the issue tracker? @ZacCranko Perhaps it's easier to merge this as is, without addressing inclusion of the state space. That can be done later. Would you please update me on where the code is at vis-a-vis @spencerlyon2's comments?
@jstac @spencerlyon2:
Two new things:
Could someone please let me know the best way for me to hand over these changes? Submit a new pull request? After this one is in I'll make a new pull request, modifying the
@ZacCranko great work on the changes. However, I'm not seeing them here. Can you push them to this branch? I'd prefer to wait on the merge until we have those things working.
Forgot I needed to commit them to my fork, sorry about that. Should be all sorted now.
@spencerlyon2 I'll leave you to merge when you're happy.
Sorry @ZacCranko, I still don't see the updates to tests, REQUIRE, or using. Am I missing something?
Ah no, it's just me being a bit incompetent. Have a look now, it should all be there.
Haha no worries, thanks for pushing them. I'm seeing that tests are failing: https://travis-ci.org/QuantEcon/QuantEcon.jl/jobs/41776822#L303-L321 Do you know why this might be happening?
Maybe just a matter of pulling the latest commits down to this fork?
@ZacCranko I have a suggestion. Let's merge this into a branch other than master in QuantEcon/QuantEcon.jl and then I can help you get these last few things sorted out. Does that work for you? Thanks for being patient with this.
No problem @spencerlyon2, I'm not that hot at this forking/merging stuff, so any help would be appreciated. Just let me know what you need me to do.
I have done the following (just for reference so we know how this happened):

# Pull latest from this repo
git pull origin master

# Create a new branch (based off the most recent master)
git checkout -b markov

# Add your repository as a remote
git remote add zcranko git@github.com:ZacCranko/QuantEcon.jl.git

# Pull from your master branch into my markov branch
git pull zcranko master

# Push my markov branch to QuantEcon/QuantEcon.jl
git push origin markov

I then opened a new pull request from that branch. I have a hunch that with the latest changes to the master branch here included with your changes, the tests will pass. Let's wait a few minutes and see what Travis tells us.
See #21. I think we are good to go. @ZacCranko I'll give you a chance to have one more look at it and then we can merge away! Thanks for working through this.
Thanks for all your help @spencerlyon2. It looks good to me, if you're happy feel free to go ahead.
No problem. Good work. Next time it won't be so difficult. For future reference, I think it would have worked from this original pull request if you had run
Good to know. Should I re-fork, and add the state parameter to the type definition now? |
What do you get when you run
I'll respond by email so we don't keep spamming the group...