Timing
From WikiLICC
Testing vectorization:
  ! http://goparallel.sourceforge.net/optimizing-loops-vectorization/
  program Vectorization
    use portlib                          ! provides secnds
    implicit none
    real(4), dimension(:), allocatable :: x, y, z
    integer :: len = 150*1024*1024       ! 150 Mi elements = 600 MiB per real(4) array
    real(4) :: timing
    integer :: i, j, ierr

    allocate( x(len), stat=ierr )
    allocate( y(len), stat=ierr )
    allocate( z(len), stat=ierr )
    x = 1.0                              ! initialize the inputs before timing
    y = 2.0
    do j = 1, 10
      timing = secnds(0.0)
      do i = 1, len
        z(i) = sqrt(x(i)) + sqrt(y(i))
      end do
      timing = secnds(timing)*1000       ! elapsed time in milliseconds
      print *, ' Timing =', timing, '/1000 s'
    end do
  end program
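The test above relies on secnds from portlib, which is a vendor extension. A minimal sketch of the same measurement using the standard system_clock intrinsic follows. The names timing_mod, elapsed_ms and timing_sketch are hypothetical, and the array size is deliberately much smaller than the 150*1024*1024 used in the test, so the numbers it prints are not comparable with the results on this page.

```fortran
module timing_mod
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
contains
  ! Convert two system_clock counts into elapsed milliseconds.
  pure function elapsed_ms(t0, t1, rate) result(ms)
    integer(int64), intent(in) :: t0, t1, rate
    real :: ms
    ms = 1000.0 * real(t1 - t0) / real(rate)
  end function elapsed_ms
end module timing_mod

program timing_sketch
  use timing_mod
  implicit none
  integer, parameter :: n = 1024*1024   ! hypothetical small size, for illustration only
  real(4), allocatable :: x(:), y(:), z(:)
  integer(int64) :: t0, t1, rate
  integer :: i, ierr

  allocate(x(n), y(n), z(n), stat=ierr)
  if (ierr /= 0) stop 'allocation failed'
  x = 1.0
  y = 2.0

  call system_clock(count_rate=rate)    ! clock ticks per second
  call system_clock(count=t0)
  do i = 1, n
     z(i) = sqrt(x(i)) + sqrt(y(i))
  end do
  call system_clock(count=t1)

  print *, ' Timing =', elapsed_ms(t0, t1, rate), 'ms'
  print *, ' z(n)   =', z(n)            ! print a result so the loop is not optimized away
end program timing_sketch
```

The sketch ignores counter wraparound, which should not matter for runs this short.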
- Memory: measured with the Windows Performance Monitor
Largest allocatable problem: 150 Mi elements * 3 arrays * 4 bytes = 1.76 GiB = 1.89 GB
  real(4)  = 1.85 GB
  real(8)  = 3.69 GB
  real(16) = 7.39 GB
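The size arithmetic above can be checked with a short sketch. It computes the theoretical footprint of three arrays of 150*1024*1024 elements for each element width. The names footprint_mod and footprint_bytes are hypothetical, and the sketch assumes that real(4), real(8) and real(16) occupy 4, 8 and 16 bytes respectively. The values read from the Performance Monitor need not match these theoretical sizes exactly.

```fortran
module footprint_mod
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
contains
  ! Bytes needed by narrays arrays of nelem elements, each bytes_per_elem wide.
  pure function footprint_bytes(nelem, narrays, bytes_per_elem) result(total)
    integer(int64), intent(in) :: nelem
    integer, intent(in) :: narrays, bytes_per_elem
    integer(int64) :: total
    total = nelem * int(narrays, int64) * int(bytes_per_elem, int64)
  end function footprint_bytes
end module footprint_mod

program footprint_sketch
  use footprint_mod
  implicit none
  integer(int64), parameter :: nelem = 150_int64 * 1024 * 1024
  integer, parameter :: widths(3) = [4, 8, 16]   ! assumed bytes per element
  integer(int64) :: total
  integer :: k

  do k = 1, size(widths)
     total = footprint_bytes(nelem, 3, widths(k))
     ! Report both binary (GiB) and decimal (GB) units.
     print '(a,i0,a,f6.2,a,f6.2,a)', 'real(', widths(k), '): ', &
          real(total) / 1024.0**3, ' GiB = ', real(total) / 1.0e9, ' GB'
  end do
end program footprint_sketch
```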
- Results real(4):
  Debug   (x32)         2.13 s
  Debug   (x64)         2.00 s
  Release (x32) /O2     0.143 s
  Release (x64) /O2     0.140 s <=========
  Release (x64):
  Threshold for vectorization 0       0.140 s
  Threshold for parallelization 0     0.140 s
  /Qvec-                              0.909 s
  /Qvec- /Qparallelization            0.171 s   uses 8 processors
  Inline directive                    0.145 s
  /Ob1 (uses 4 processors):
  /Qvec- /Qparallelization            0.232 s
- Results real(8):
  Release (x64)                       0.461 s
  /Qvec-                              0.911 s
  /Qvec- /Qparallelization            0.356 s
  /Qparallelization                   0.342 s <==========
- real(16): sloooow
  Release (x64)                       7.00 s
  /Qvec-                              7.02 s
  /Qvec- /Qparallelization            1.75 s
  /Qparallelization                   1.75 s