Importing many matrices into MATLAB math, software
Old 02-26-2010, 12:14 AM   #1
Latro
Veteran Member [85%]
 
MBTI: INTP
Join Date: Apr 2009
Posts: 3,414
 
I'm doing an assignment about ranks of random matrices. So I made the matrices in Python (not the best choice, but it took about a second, so I think I'm ok), manipulated them into the form that I would usually use to put into MATLAB, and have them in a text file. They're sorted like this:
[0, 0, 1; 0, 1, 0; 1, 1, 0]
[0, 1, 0; 1, 0, 1; 1, 1, 1]
etc., except there are quite a few of them (one file has 512 of them; the other has 32768), which makes manual entry not practical.

All my attempts at googling for a solution to this problem have failed; the guides seem to be too dense for me to actually find what I'm fishing for. I'm a major newb with MATLAB, only knowing a few minor tidbits like rref() and rank(), but this seems like it should be really easy, and yet doesn't seem to be. Can anyone help?

Old 02-26-2010, 06:56 AM   #2
shytiger
Member [22%]
MBTI: INTJ
Join Date: Sep 2009
Posts: 895
 
I'm surprised you didn't post this to the MATLAB forum. There are a few ways to do it. One is textscan: C = textscan(fid, '[%d, %d, %d; %d, %d, %d; %d, %d, %d]', 512), which will read them all into a cell array. Then read the cell array elements into matrices.
Old 02-26-2010, 06:59 AM   #3
Latro
Veteran Member [85%]
 
MBTI: INTP
Join Date: Apr 2009
Posts: 3,414
 

  Originally Posted by shytiger
I'm surprised you didn't post this to the matlab forum.

Could have probably done that...didn't pop into my head (it was 3 AM and I was frustrated at how this wasn't being cooperative). I'm trying that now, thank you.

---------- Post added 02-26-2010 at 10:46 AM ----------

When I do that without spaces between the %d (there are spaces in my file), I get 9 512x1 cell arrays. When I do it with the spaces I get 9 0x1 cell arrays. Not exactly sure what I'm doing wrong here.

---------- Post added 02-26-2010 at 10:48 AM ----------

To reiterate, what I did was:
T = fopen('threes.txt')
C = textscan(T,'[%d,%d,%d;%d,%d,%d;%d,%d,%d]',512)
and I got 9 512x1 things instead of 512 3x3 things. What happened?
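
A note on the shape of that result: as far as I understand it, textscan returns one cell per conversion specifier in the format string, so nine %d fields give nine columns of 512 values each rather than 512 matrices. The regrouping step is easy to see in Python. This is only a sketch of the idea, and columns_to_matrices is a made-up helper name, not anything from MATLAB:

```python
def columns_to_matrices(columns, n_rows=3, n_cols=3):
    """Regroup per-field columns into one n_rows x n_cols matrix per input line.

    columns: a list of n_rows * n_cols sequences, where columns[j][i] is the
    j-th number read from the i-th line of the file (an assumed layout).
    """
    if len(columns) != n_rows * n_cols:
        raise ValueError("expected one column per matrix entry")
    matrices = []
    for fields in zip(*columns):  # one tuple of entries per line of the file
        matrices.append([list(fields[r * n_cols:(r + 1) * n_cols])
                         for r in range(n_rows)])
    return matrices
```

The same regrouping would have to be done on the MATLAB side with the nine columns in C before rank() can be applied to each 3x3 block.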

Old 02-26-2010, 11:32 PM   #4
nanotube
Member [03%]
 
MBTI: INTJ
Join Date: Feb 2010
Posts: 140
 
Ah, this is why I joined this forum... tips for scripting MatLab. The funny thing is, this will be useful to me on Monday... because I also need to import a matrix into Matlab.
Old 02-27-2010, 11:32 AM   #5
Malle
Member [06%]
 
MBTI: INTJ
Join Date: Dec 2008
Posts: 256
 
If you aren't required to import it, MATLAB is perfectly capable of generating large amounts of random matrices.

Depending on your version of MATLAB, randint or randi should cover that.
Old 02-27-2010, 02:50 PM   #6
Latro
Veteran Member [85%]
 
MBTI: INTP
Join Date: Apr 2009
Posts: 3,414
 

  Originally Posted by Malle
If you aren't required to import it, MATLAB is perfectly capable of generating large amounts of random matrices.

Depending on your version of MATLAB, randint or randi should cover that.

I'm just spanning all the cases; the assignment is about random matrices, so one way to attack it is (since the relevant set is finite) to just check all the cases. My generation method was in Python, doing it this way:

def GenerateIsomorphisms(size):  # generates every 0/1 list of length size (the flattened matrices).
    isos = [[]]
    for n in range(size):
        # range(len(isos)) is evaluated once, so only the existing lists are extended.
        for x in range(len(isos)):
            isos.append(isos[x] + [1])
            isos[x].append(0)
    return isos

def linetomatnine(isos):  # prints each 9 element list as a 3x3 matrix in MATLAB format.
    for iso in isos:
        iso = list(str(iso))
        iso[8] = ';'   # the comma after the 3rd entry becomes a row break
        iso[17] = ';'  # the comma after the 6th entry becomes a row break
        print(''.join(iso))

if __name__ == '__main__':
    # linetomatnine prints as it goes and returns nothing, so it is not wrapped in print().
    linetomatnine(GenerateIsomorphisms(9))

I wouldn't know how to build them in MATLAB. That would work just as well, I just have no idea how to do it.
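
For comparison, the same enumeration can be written with itertools.product, which also avoids hard-coding the character positions 8 and 17. This is a sketch rather than the script I actually ran; matlab_literal and all_binary_matrices are names made up for this example:

```python
from itertools import product

def matlab_literal(bits, n_cols):
    """Format a flat sequence of 0/1 entries as a MATLAB matrix literal."""
    rows = [bits[i:i + n_cols] for i in range(0, len(bits), n_cols)]
    return "[" + "; ".join(", ".join(str(b) for b in row) for row in rows) + "]"

def all_binary_matrices(n_rows, n_cols):
    """Yield every 0/1 matrix of the given shape as a MATLAB literal string."""
    for bits in product((0, 1), repeat=n_rows * n_cols):
        yield matlab_literal(bits, n_cols)
```

Looping over all_binary_matrices(3, 3) and printing each item gives the 512 lines for the 3x3 case; all_binary_matrices(3, 5) gives the 32768 lines for the 3x5 case.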
Old 02-27-2010, 03:14 PM   #7
Marcus
Member [25%]
Do not feed the bear!
MBTI: INTJ
Join Date: Apr 2008
Posts: 1,027
 

  Originally Posted by Latro
[0, 0, 1; 0, 1, 0; 1, 1, 0]
[0, 1, 0; 1, 0, 1; 1, 1, 1]

I'd let my Python script print out a Matlab code directly, like:
A1=[0, 0, 1; 0, 1, 0; 1, 1, 0];
A2=[0, 1, 0; 1, 0, 1; 1, 1, 1];

...

r1=rank(A1);
r2=rank(A2);

etc...
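
On the Python side that could look roughly like the sketch below. write_rank_script is a hypothetical helper, and it assumes the matrices are already available as strings in the format quoted above; it interleaves the A and r lines instead of grouping them, which should make no difference to MATLAB:

```python
def write_rank_script(matrix_lines, path):
    """Write a MATLAB script that defines A1, A2, ... and r1, r2, ...

    matrix_lines: iterable of strings such as "[0, 0, 1; 0, 1, 0; 1, 1, 0]".
    path: output file name; it should end in .m so MATLAB can run it.
    """
    with open(path, "w") as out:
        for k, literal in enumerate(matrix_lines, start=1):
            out.write(f"A{k}={literal.strip()};\n")
            out.write(f"r{k}=rank(A{k});\n")
```

The trailing semicolons keep MATLAB from echoing every assignment to the console.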

Old 02-27-2010, 03:43 PM   #8
Latro
Veteran Member [85%]
 
MBTI: INTP
Join Date: Apr 2009
Posts: 3,414
 

  Originally Posted by Marcus
I'd let my Python script print out a Matlab code directly, like:
A1=[0, 0, 1; 0, 1, 0; 1, 1, 0];
A2=[0, 1, 0; 1, 0, 1; 1, 1, 1];

...

r1=rank(A1);
r2=rank(A2);

etc...

Ah...that's interesting. In the 32768 matrix case, though, that would be kind of an inefficient way to do it, would it not? Since I'd have to copy/paste 60,000+ lines?

Still, that's probably a pretty good way to do it since I can't seem to figure it out otherwise.

---------- Post added 02-27-2010 at 09:56 PM ----------

Well doing it that way seems to be working, although the output of the python file in the 32768 case is like 360,000 lines and is being very slow in MATLAB.

Old 02-28-2010, 04:18 AM   #9
Marcus
Member [25%]
Do not feed the bear!
MBTI: INTJ
Join Date: Apr 2008
Posts: 1,027
 

  Originally Posted by Latro
Ah...that's interesting. In the 32768 matrix case, though, that would be kind of an inefficient way to do it, would it not? Since I'd have to copy/paste 60,000+ lines?

You shouldn't do copy/paste. You just have to give a .m extension to your output file and then run it under Matlab.

Old 02-28-2010, 05:29 AM   #10
Latro
Veteran Member [85%]
 
MBTI: INTP
Join Date: Apr 2009
Posts: 3,414
 

  Originally Posted by Marcus
You shouldn't do copy/paste. You just have to give a .m extension to your output file and then run it under Matlab.

Yeah, I'm doing that now. I ran the large case overnight, however, and it's apparently still running, 8-9 hours later. I suspect that I should have done it differently, by instead defining a big array and iterating through it. It'd still take a long time, no way around that, but at least it wouldn't be 360,000 lines.

Old 02-28-2010, 07:17 AM   #11
tp6626
Core Member [108%]
Curmudgeon, miser, CAD advisor!
MBTI: iNTj
Join Date: Apr 2008
Posts: 4,345
 
Can't you just manipulate your text input file using code to prepend an incrementing "A1 = " to each line? And then call that modified text file in your MATLAB code?

That kind of thing shouldn't take 8-9 hours, so I'd guess you've set up some kind of circling loop in your previous attempt.

Excel would even handle this type of operation for 32000 rows of data (and without any real coding).

You'd:

1. Use Excel's data import tool on your text file, and import each line as a text string into its own cell.

2. Insert a new column to the left of your column of text strings.

3. Type in cell A1 a text string of "A1 = ", and in cell A2 "A2 = ".

4. Auto-fill that column downwards to the end of your data range.

5. In cell C1 type "=A1 & B1"

6. Auto-fill that column downwards to the end of your data range.

7. Copy the whole of column C, and then do a "Paste Special, Values only" into column A.

8. Delete all your other columns so that you are only left with the one column with your full text string.

9. Save that as a CSV from Excel, but rename the extension to .txt or .m or whatever afterwards.

10. Call that text file from matlab at the relevant part in your code.

I'm not familiar with matlab, but in Ansys (which is loosely based on fortran), you can at any point in a procedure call external procedures saved as text (.inp) files.

So at the point where you want to define these arrays, you'd just type the filename (making sure the input file is actually in the working directory, that is).

I assume that a .m file is some sort of macro or input file that can be used in matlab in this way.
Old 02-28-2010, 07:34 AM   #12
Latro
Veteran Member [85%]
 
MBTI: INTP
Join Date: Apr 2009
Posts: 3,414
 
I got it running as it stands, and the method is exactly the same as the case for 512 3x3 matrices, which took about 20-25 seconds. The code method is fine, albeit slow. There is a significant chance that I have been disconnected from the server that I'm running the code on (I don't have MATLAB at home), however if this is the case I'm not seeing that at the moment; the terminal is just sitting there blinking, and it's been over 10 hours. There aren't any loops at all in the code, so I don't think anything like that could be the problem.

As it stands what I did was this:
Make the 4 rank counters 0.
A = matrix
r = rank(A)
Elseif chain to increment the correct rank counter
Do the same thing 32768 times with different matrices. (This makes a text file with over 360,000 lines that I put into MATLAB).
Type in the names of the rank counters to have them return the final values.

I suspect to make MATLAB's interpreter happy what I should do is instead:
Make the 4 rank counters 0.
A = [matrix1,matrix2,...,matrix32768]
for matrix=A
r = rank(matrix)
elseif chain to increment the correct counter
end
Type in names of rank counters.

Which would be about 10 lines, with the first line being an extremely long "line".

I suspect an easier way to write the Python would be to have it print like:
[
matrix
...
matrix
]
and then transpose it so I could iterate over the columns.
Old 02-28-2010, 12:09 PM   #13
tp6626
Core Member [108%]
Curmudgeon, miser, CAD advisor!
MBTI: iNTj
Join Date: Apr 2008
Posts: 4,345
 
Why have you been set this again?

I'm not massively familiar with matrix functions, and just reading up on the computation of ranks now, it appears that there are various methods, of varying computational efficiency.

I am now wondering whether you have been set both a small and a significantly larger data-set, with the intention that you are supposed to stumble across this 8-10 hour problem, and are expected therefore to come up with a more efficient way of undertaking the same computation???

Here:

 
When applied to floating point computations on computers, basic Gaussian elimination (LU decomposition) can be unreliable, and a rank revealing decomposition should be used instead. An effective alternative is the singular value decomposition (SVD), but there are other less expensive choices, such as QR decomposition with pivoting, which are still more numerically robust than Gaussian elimination. Numerical determination of rank requires a criterion for deciding when a value, such as a singular value from the SVD, should be treated as zero, a practical choice which depends on both the matrix and the application.

Maybe look into checking which method matlabs rank(A) function utilises, and then possibly look at programming by an alternate method, which is more suited to your particular problem.

---------- Post added 02-28-2010 at 08:19 PM ----------

There also seem to be loads and loads of properties of the rank of a matrix, that I suspect would allow significant optimisation of your program (that's only an intuitive hunch though, I haven't looked into it in great depth).

And there are possibly also things you can do to capture the rank values on the fly, without actually committing any intermediate decompositions of the input matrix to memory. Again, I'm just throwing ideas out here; again haven't put a massive amount of thought into it.

Old 02-28-2010, 12:26 PM   #14
Latro
Veteran Member [85%]
 
MBTI: INTP
Join Date: Apr 2009
Posts: 3,414
 

  Originally Posted by tp6626
Why have you been set this again?

I'm not massively familiar with matrix functions, and just reading up on the computation of ranks now, it appears that there are various methods, of varying computational efficiency.

I am now wondering whether you have been set both a small and a significantly larger data-set, with the intention that you are supposed to stumble across this 8-10 hour problem, and are expected therefore to come up with a more efficient way of undertaking the same computation???

Here:

Maybe look into checking which method matlabs rank(A) function utilises, and then possibly look at programming by an alternate method, which is more suited to your particular problem.

The original problem (rephrased since I don't have my book here atm):
If a 3x3 matrix is randomly constructed from 0s and 1s, what are the most likely dimensions of the four subspaces? (This is determined uniquely by the rank and the size of the matrix). What about a 3x5 matrix?

The idea here is just to build intuition about likely sizes of matrices and to look at ways to investigate random matrices (which are very important in applications, and much more open than many other problems in linear algebra) in general.

The 3x3 matrix has 2^9=512 possible cases (9 positions, 2 choices each), and I've already analyzed that; the solution was 2, with 1 rank 0, 49 rank 1, 288 rank 2, and 174 rank 3. The 3x5 case unfortunately has 2^15=32768 cases (15 positions, 2 choices each).
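
Since every entry is an integer, these tallies can also be cross-checked without MATLAB. The sketch below is not what I ran for the assignment; it uses fraction-free elimination (cross-multiplying rows so everything stays an integer) in place of MATLAB's rank(), and exact_rank and rank_counts are names made up for this example:

```python
from collections import Counter
from itertools import product

def exact_rank(rows):
    """Rank of an integer matrix via fraction-free elimination (exact, no rounding)."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        p = m[rank][col]
        for r in range(rank + 1, len(m)):
            f = m[r][col]
            if f != 0:
                # Cross-multiply so every entry stays an integer.
                m[r] = [p * a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

def rank_counts(n_rows, n_cols):
    """Tally ranks over every n_rows x n_cols matrix with 0/1 entries."""
    counts = Counter()
    for bits in product((0, 1), repeat=n_rows * n_cols):
        rows = [bits[i:i + n_cols] for i in range(0, len(bits), n_cols)]
        counts[exact_rank(rows)] += 1
    return counts
```

rank_counts(3, 3) reproduces the 1 / 49 / 288 / 174 split above, and rank_counts(3, 5) walks all 32768 matrices of the 3x5 case the same way.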

The actual method of decision and proof (and the professor did say that we need to prove our conjecture) is completely open. Initially I tried a combinatoric approach, but there were unfortunately a lot of issues with double counting in my approach. In the 3x3 case, for example, I got 1 rank 0, 49 rank 1, and fewer than 210 rank 3, with 512 all in all; this forces rank 2 to be the most common (more than 512 - 1 - 49 - 210 = 252 matrices). I can almost certainly say that rank 3 is most common in the 3x5 case, because adding columns to a matrix of a given rank r can only increase its rank if it affects it at all.

You are right, however, that when you know for a fact that everything is an integer you could code the operations to be simpler if you tried. But no, this is not a numerical analysis class, so I don't think fiddling with coding your own rank estimation algorithm is really expected.

One thing I really do know, though, is that MATLAB is very good at doing things as vector and matrix operations whenever possible. It takes time to parse the sheer SWATH of lines that it is faced with when each line is presented on its own, but it has lots of really good optimization routines for iterating through an array.

Old 02-28-2010, 05:49 PM   #15
Malle
Member [06%]
 
MBTI: INTJ
Join Date: Dec 2008
Posts: 256
 
Let's note that, limiting ourselves to matrices of a given size, a matrix can be uniquely identified by a row vector which is simply its rows concatenated.

For instance, if we analyze 3x3 matrices, the matrix
1 0 1
1 1 0
0 0 1
can be uniquely identified with the row vector 1 0 1 1 1 0 0 0 1. (This we can in turn have represent a binary number, which I will use in one of my solutions).

Seeing this you could store your output in a text file as only numbers with some delimiter and use importdata(filename) to import the whole set to one matrix. You could then modify that 2-dimensional matrix to a three dimensional matrix where each layer is one of the matrices that could be generated.


Here's how I did it:

It is clear by the above note that to study all the nxm matrices where the elements are chosen as 0 or 1 independently and uniformly we can study the binary representation of the matrices, which will consist of all binary numbers with n*m bits.

The first method I used to solve this was to build a matrix with all the representing row vectors, then converting that 2-dimensional matrix to a 3-dimensional matrix where the third dimension was used to differ between matrices. Since I couldn't find any function in MATLAB to calculate the rank independently of each matrix as I had them represented, I just looped through the third dimension and calculated the rank for each matrix on its own. This method requires a lot of memory to hold all the possible matrices at once, but is relatively fast when the memory is sufficient.

The second method I used was to use the binary representation. Since a binary number with N bits can represent at most 2^N-1 in decimal, the maximum value is easy to calculate. I loop through all the possible decimal values, and for each one I convert the value to a binary number, map the binary number to the corresponding matrix and then calculate its rank. Since this never stores the full set of matrices it requires much less memory, but it uses loops and is thus a lot slower, especially since the number of values to loop through grows as 2^(n*m) for an nxm matrix.
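
The mapping from a number to its matrix can be sketched in a few lines of Python. This is not the MATLAB file linked below, only an illustration of the idea, and int_to_matrix is a name made up for this example; it assumes the most significant bit goes in the top-left corner, as in the 3x3 example above:

```python
def int_to_matrix(k, n_rows, n_cols):
    """Map an integer 0 <= k < 2**(n_rows*n_cols) to a 0/1 matrix.

    The binary digits of k, most significant first, fill the matrix row by row.
    """
    n_bits = n_rows * n_cols
    if not 0 <= k < 2 ** n_bits:
        raise ValueError("k does not fit in n_rows * n_cols bits")
    bits = [(k >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]
    return [bits[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]
```

For instance, int_to_matrix(0b101110001, 3, 3) gives the rows 1 0 1, 1 1 0 and 0 0 1, and looping k from 0 to 2^(n*m)-1 visits each matrix exactly once without storing the whole set.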

Links to the files:

- the faster one

- the slower one

Unless you can't run the first one I severely doubt you have any use for the latter; while it takes me less than a second to run for 3x3 and smaller, and 32 seconds for 4x4, I estimate it would take somewhere on the order of a day or so to run a 5x5. The fast method took me less than 2 seconds to run for 4x4 and smaller.

Note: These were written in MATLAB 2009b; I cannot guarantee they will work in any other version.